Commit Graph

2480 Commits

Andreas Kling
c0e520463e LibJS: Invalidate prototype chains via per-shape child lists
invalidate_all_prototype_chains_leading_to_this used to scan every
prototype shape in the realm and walk each one's chain looking for
the mutated shape. That was O(N_prototype_shapes x chain_depth) per
mutation and showed up hot in real profiles when a page churned a
lot of prototype state during startup.

Each prototype shape now keeps a weak list of the prototype shapes
whose immediate [[Prototype]] points at the object that owns this
shape. The list is registered on prototype-shape creation
(clone_for_prototype, set_prototype_shape) and migrated to the new
prototype shape when the owning prototype object transitions to a
new shape. Invalidation is then a recursive walk over this direct-
child registry, costing O(transitive descendants).

Saves ~300 ms of main thread time when loading https://youtube.com/
on my Linux machine. :^)
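The direct-child registry can be sketched at a high level in JavaScript. Class and method names here are illustrative, not the actual LibJS types:

```javascript
// Illustrative sketch of the per-shape direct-child registry (hypothetical
// names, not the real LibJS classes). Each prototype shape keeps the set of
// prototype shapes whose immediate [[Prototype]] is the object owning this
// shape; the real engine keeps this as a weak list.
class PrototypeShape {
    constructor(name) {
        this.name = name;
        this.directChildren = new Set();
        this.chainValid = true;
    }
    registerChild(childShape) {
        this.directChildren.add(childShape);
    }
    // Invalidation visits only transitive descendants instead of scanning
    // every prototype shape in the realm.
    invalidateDescendantChains() {
        for (const child of this.directChildren) {
            if (!child.chainValid)
                continue;
            child.chainValid = false;
            child.invalidateDescendantChains();
        }
    }
}
```

Unrelated prototype shapes are never visited, which is where the O(transitive descendants) cost comes from.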
2026-04-24 18:59:01 +02:00
Timothy Flynn
12d9aaebb3 LibJS: Remove gc from the global object
No other engine defines this function, so it is an observable difference
in our engine's behavior. This traces back to the earliest days of LibJS.

We now define `gc` in just the test-js and test262 runners.
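The observable effect from script, as a minimal sketch (most hosts, e.g. node without --expose-gc, behave this way too):

```javascript
// After this change, ordinary scripts can no longer observe a global `gc`
// function; only the test-js and test262 runners still define it.
console.log(typeof globalThis.gc); // prints "undefined" unless the host opts in
```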
2026-04-24 18:36:23 +02:00
Aliaksandr Kalenik
bfbc3352b5 LibJS: Extend Array.prototype.shift() fast path to holey arrays
indexed_take_first() already memmoves elements down for both Packed and
Holey storage, but the caller at ArrayPrototype::shift() only entered
the fast path for Packed arrays. Holey arrays fell through to the
spec-literal per-element loop (has_property / get / set /
delete_property_or_throw), which is substantially slower.

Add a separate Holey predicate with the additional safety checks the
spec semantics require: default_prototype_chain_intact() (so
HasProperty on a hole doesn't escape to a poisoned prototype) and
extensible() (so set() on a hole slot doesn't create a new own
property on a non-extensible object). The existing Packed predicate
is left unchanged -- packed arrays don't need these checks because
every index in [0, size) is already an own data property.

Allows us to fail at Cloudflare Turnstile much faster!
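The spec semantics the fast path must preserve are visible from script:

```javascript
// shift() on a holey array: holes move down rather than becoming own
// properties. This is why the fast path needs the extra checks -- a hole
// lookup could otherwise escape to Array.prototype, and writing through a
// hole could otherwise add a new own property.
const a = [1, , 3];          // holey: index 1 is a hole
const first = a.shift();     // returns 1
// The hole shifts down: index 0 is now a hole, 3 lands at index 1.
console.log(first, 0 in a, a[1], a.length); // 1 false 3 2
```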
2026-04-23 21:47:21 +02:00
Aliaksandr Kalenik
ad7177eccb LibJS: Use memmove in Object::indexed_take_first()
The element-by-element loop compiled to scalar 8-byte moves that the
compiler could not vectorize: source and destination alias, and strict
aliasing prevented hoisting the m_indexed_elements pointer load out of
the loop body. memmove collapses the shift into a single vectorized
copy.
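The JS-level analog of this overlapping move is `Array.prototype.copyWithin`, which is likewise specified to handle overlapping source and destination ranges, memmove-style:

```javascript
// Taking the first element by moving everything down one slot -- the same
// overlapping-copy shape that indexed_take_first() performs with memmove
// at the C++ level.
const elems = [10, 20, 30, 40];
const first = elems[0];
elems.copyWithin(0, 1); // overlap-safe shift-down: [20, 30, 40, 40]
elems.length -= 1;      // drop the duplicated tail: [20, 30, 40]
```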
2026-04-23 21:47:21 +02:00
Luke Wilde
3d3b02b9c0 LibJS: Use the real globalThis value
Previously it used `realm.[[GlobalObject]]` instead of
`realm.[[GlobalEnv]].[[GlobalThisValue]]`.

In LibWeb, that corresponds to Window and WindowProxy respectively.
2026-04-23 20:43:01 +01:00
Timothy Flynn
5c34c7f554 Meta: Move python code generators to a subdirectory
Let's have a bit of organization here, rather than an ever-growing Meta
folder.
2026-04-23 07:31:19 -04:00
Andreas Kling
43aecd3f90 LibJS: Skip property-table key marking for non-dictionary shapes
Shape::visit_edges used to walk every entry of m_property_table and
call PropertyKey::visit_edges on each key. For non-dictionary shapes
that work is redundant: m_property_table is a lazily built cache of
the transition chain, and every key it contains was originally
inserted as some ancestor shape's m_property_key, which is already
kept alive via m_previous.

Intrinsic shapes populated through add_property_without_transition()
in Intrinsics.cpp are not dictionaries and have no m_previous to
reach their keys through, but each of those keys is either a
vm.names.* string or a well-known symbol and is strongly rooted by
the VM for its whole lifetime, so skipping them here is safe too.

Measured on the main WebWorker used by https://www.maptiler.com/maps/
this cuts out ~98% of the PropertyKey::visit_edges calls made by
Shape::visit_edges each GC, reducing time spent in GC by ~1.3 seconds
on my Linux PC while initially loading the map.
2026-04-23 02:14:01 +02:00
Andreas Kling
eb9432fcb8 LibJS: Preserve source positions in bytecode source maps
Carry full source positions through the Rust bytecode source map so
stack traces and other bytecode-backed source lookups can use them
directly.

This keeps exception-heavy paths from reconstructing line and column
information through SourceCode::range_from_offsets(), which can spend a
lot of time building SourceCode's position cache on first use.

We're trading some space for time here, but I believe it's worth it at
this stage, as this saves ~250ms of main thread time while loading
https://x.com/ on my Linux machine. :^)

Reading the stored Position out of the source map directly also exposed
two things masked by the old range_from_offsets() path: a latent
off-by-one in Lexer::new_at_offset() (its consume() bumped line_column
past the character at offset; only synthesize_binding_pattern() hit it),
and a (1,1) fallback in range_from_offsets() that fired whenever the
queried range reached EOF. Fix the lexer, then rebaseline both the
bytecode dump tests (no more spurious "1:1") and the destructuring AST
tests (binding-pattern identifiers now report their real columns).
2026-04-22 22:34:54 +02:00
Aliaksandr Kalenik
1f46651af5 LibJS: Reuse cached UTF-16 in Array.prototype.sort's string comparator
CompareArrayElements was calling ToString(x) +
PrimitiveString::create(vm, ...) on every comparison, producing a
fresh PrimitiveString that wrapped the original's AK::String but
carried no cached UTF-16. The subsequent IsLessThan then hit
PrimitiveString::utf16_string_view() on that fresh object, which
re-ran simdutf UTF-8 validation + UTF-8 -> UTF-16 conversion for
both sides on every one of the N log N comparisons.

When x and y are already String Values, ToString(x) and
ToPrimitive(x, Number) are the identity per spec, so we can drop
the IsLessThan detour entirely and compare their Utf16Views
directly. The original PrimitiveString caches its UTF-16 on first
access, so subsequent comparisons against the same element hit
the cache; Utf16View::operator<=> additionally gives us a memcmp
fast path when both sides ended up with short-ASCII UTF-16 storage.

Microbenchmark:
```js
function makeStrings(n) {
    let seed = 1234567;
    const rand = () => {
        seed = (seed * 1103515245 + 12345) & 0x7fffffff;
        return seed;
    };
    const out = new Array(n);
    for (let i = 0; i < n; i++)
        out[i] = "item_" + rand().toString(36)
            + "_" + rand().toString(36);
    return out;
}
const base = makeStrings(100000);
const arr = base.slice();
arr.sort();
```

```
n       before  after   speedup
1k      0.70ms  0.30ms  2.3x
10k     8.33ms  3.33ms  2.5x
50k    49.33ms 17.33ms  2.8x
100k  118.00ms 45.00ms  2.6x
```
2026-04-22 19:12:54 +02:00
Andreas Kling
51758f3022 LibJS: Make bytecode register allocator O(1)
Generator::allocate_register used to scan the free pool to find the
lowest-numbered register and then Vec::remove it, making every
allocation O(n) in the size of the pool. When loading https://x.com/
on my Linux machine, we spent ~800ms in this function alone!

This logic only existed to match the C++ register allocation ordering
while transitioning the LibJS compiler from C++ to Rust, so now we can
simply get rid of it. :^)

So drop the "always hand out the lowest-numbered free register" policy
and use the pool as a plain LIFO stack. Pushing and popping the back
of the Vec are both O(1), and peak register usage is unchanged since
the policy only affects which specific register gets reused, not how
aggressively.
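A minimal sketch of the new allocation policy (illustrative, not the actual Rust generator code):

```javascript
// Free registers form a plain LIFO stack: allocate() pops the most recently
// freed register (or mints a fresh one), release() pushes. Both operations
// are O(1); peak register usage is unchanged because the policy only picks
// which free register gets reused, not how many exist.
class RegisterPool {
    #next = 0;
    #free = [];
    allocate() {
        return this.#free.length > 0 ? this.#free.pop() : this.#next++;
    }
    release(reg) {
        this.#free.push(reg);
    }
}
```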
2026-04-21 13:59:55 +02:00
Undefine
e39a8719fd Meta: Move most dependency checks to check_for_dependencies.cmake
This file has been here for quite a long while now. Let's finally move
most of the dependency checks to one centralized place.
2026-04-20 16:41:29 -06:00
Andreas Kling
e5d4c5cce8 LibJS: Check TDZ state in asm environment calls
GetCalleeAndThisFromEnvironment treated a binding as initialized when
its value slot was not <empty>. Declarative bindings do not encode TDZ
in that slot, though: uninitialized bindings keep a separate initialized
flag and their value starts as undefined.

That let the first slow-path TDZ failure populate the environment cache,
then a second call at the same site reused the cached coordinate and
turned the required ReferenceError into a TypeError from calling
undefined.

Check Binding.initialized in the asm fast path instead and cover the
cached second-hit case with a regression test.
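The required behavior can be exercised from script; a correct engine throws ReferenceError on every hit of the same call site, not just the first:

```javascript
// Calling a `let` binding while it is still in its temporal dead zone must
// throw ReferenceError each time. The bug turned the second hit of the same
// call site into a TypeError (calling undefined) via the environment cache.
function probe() {
    const seen = [];
    for (let i = 0; i < 2; i++) {
        try {
            f(); // same call site, hit twice while f is in its TDZ
        } catch (e) {
            seen.push(e.constructor.name);
        }
    }
    let f = () => {};
    return seen;
}
console.log(probe()); // [ 'ReferenceError', 'ReferenceError' ]
```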
2026-04-20 11:23:34 +02:00
Yayoi-cs
d8aee7f1e6 LibJS: Refresh TypedArray cached data pointers on shared memory grow
WebAssembly.Memory({shared:true}).grow() reallocates the underlying
AK::ByteBuffer outline (kmalloc+kfree) but, per the threads proposal,
must not detach the associated SharedArrayBuffer.

ArrayBuffer::detach_buffer was the only path that walked m_cached_views
and cleared the cached raw m_data pointer on each TypedArrayBase, so
every existing view retained a dangling pointer into the freed outline.
The AsmInterpreter GetByValue / PutByValue fast paths dereference that
cached pointer directly, yielding a use-after-free triggerable from
JavaScript.

Add ArrayBuffer::refresh_cached_typed_array_view_data_pointers() which
re-derives m_data for each registered view from the current outline
base (and refreshes UnownedFixedLengthByteBuffer::size), and call it
from Memory::refresh_the_memory_buffer on the SAB-fixed-length path
where detach is spec-forbidden.
2026-04-20 09:43:08 +02:00
Timothy Flynn
10ce847931 LibJS+LibUnicode: Use LibUnicode as appropriate for lexing JavaScript
Now that LibUnicode exports its character type APIs in Rust, we can use
them to lex identifiers and whitespace.

Fixes #8870.
2026-04-19 10:39:26 +02:00
Andrew Kaster
f26cb24751 Rust: Add a config file for rustfmt
This sets max_width to 120, which causes a lot of reformatting.
2026-04-18 08:05:47 -04:00
Andreas Kling
530f6fb05c LibJS: Fold nested Rust match conditionals
Move several let/const checks and the `instanceof` keyword check into
match guards.
2026-04-16 22:44:41 +02:00
Andreas Kling
583fa475fb LibJS: Call RawNativeFunction directly from asm Call
The asm interpreter already inlines ECMAScript calls, but builtin calls
still went through the generic C++ Call slow path even when the callee
was a plain native function pointer. That added an avoidable boundary
around hot builtin calls and kept asm from taking full advantage of the
new RawNativeFunction representation.

Teach the asm Call handler to recognize RawNativeFunction, allocate the
callee frame on the interpreter stack, copy the call-site arguments,
and jump straight to the stored C++ entry point.
NativeJavaScriptBackedFunction and other non-raw callees keep falling
through to the existing C++ slow path unchanged.
2026-04-15 15:57:48 +02:00
Andreas Kling
0d9be0feda LibJS: Move arguments [[ParameterMap]] to Shape
Mapped and unmapped arguments objects already use dedicated premade
shapes. Track their [[ParameterMap]] internal slot there instead of
setting an Object flag after construction.

This keeps the information on the shared shape, preserves it through
shape transitions, and still lets Object.prototype.toString()
recognize arguments objects without per-instance bookkeeping.
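The recognition being preserved is observable via `Object.prototype.toString`:

```javascript
// Object.prototype.toString must keep reporting "[object Arguments]" for
// both mapped (sloppy) and unmapped (strict) arguments objects, now via
// their shared shapes rather than a per-instance flag.
function sloppy() { return Object.prototype.toString.call(arguments); }
function strict() { "use strict"; return Object.prototype.toString.call(arguments); }
console.log(sloppy(), strict()); // [object Arguments] [object Arguments]
```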
2026-04-15 15:57:48 +02:00
Andreas Kling
8a9d5ee1a1 LibJS: Separate raw and capturing native functions
NativeFunction previously stored an AK::Function for every builtin,
even when the callable was just a plain C++ entry point. That mixed
together two different representations, made simple builtins carry
capture storage they did not need, and forced the GC to treat every
native function as if it might contain captured JS values.

Introduce RawNativeFunction for plain NativeFunctionPointer callees
and keep AK::Function-backed callables on a CapturingNativeFunction
subclass. Update the straightforward native registrations in LibJS
and LibWeb to use the raw representation, while leaving exported
Wasm functions on the capturing path because they still capture
state.

Wrap UniversalGlobalScope's byte-length strategy lambda in
Function<...> explicitly so it keeps selecting the capturing
NativeFunction::create overload.
2026-04-15 15:57:48 +02:00
Timothy Flynn
4b1ecbc9df LibJS+LibUnicode: Update icu4x's calendar module to 2.2.0
First: We now pin the icu4x version to an exact number. Minor version
upgrades can result in noisy deprecation warnings and API changes which
cause tests to fail. So let's pin the known-good version exactly.

This patch updates our Rust calendar module to use the new APIs. This
initially caused some test failures due to the new Date::try_new API
(which is the recommended replacement for Date::try_new_from_codes)
having quite a limited year range of +/-9999. So we must use other
APIs (Date::try_from_fields and calendrical_calculations::gregorian)
to avoid these limits.

http://github.com/unicode-org/icu4x/blob/main/CHANGELOG.md#icu4x-22
2026-04-14 18:12:31 -04:00
Andreas Kling
23fea4208c LibJS: Pair-load asm global Realm metadata
Place Realm's cached declarative environment next to its global object
so the asm global access fast paths can fetch the two pointers with a
paired load. These handlers never use the intervening GlobalEnvironment
pointer directly.
2026-04-14 12:37:12 +02:00
Andreas Kling
b6c7f6c0c4 LibJS: Cache Executable constants for asm Call
Mirror Executable's constants size and data pointer in adjacent fields
so the asm Call fast path can pair-load them together. The underlying
Vector layout keeps size and data apart, so a small cached raw span
lets the hot constant-copy loop fetch both pieces of metadata at once.
2026-04-14 12:37:12 +02:00
Andreas Kling
5761f6bc54 LibJS: Pair-load PropertyNameIterator index counters
Load PropertyNameIterator's indexed-property count and next index
together when stepping the fast path. Keeping the paired count live
into the named-property case also avoids reloading it before computing
the flattened index.
2026-04-14 12:37:12 +02:00
Andreas Kling
3005945b38 LibJS: Pair-load PropertyNameIterator shape metadata
Load PropertyNameIterator's cached property cache and shape snapshot
together before validating the receiver shape. The two fields already
sit adjacent in the object layout, so the fast path can fetch both
without any extra reshuffling.
2026-04-14 12:37:12 +02:00
Andreas Kling
8dcb2b95ec LibJS: Pair-load asm environment coordinates
Load EnvironmentCoordinate::hops and ::index together in the asm
environment-walk helper. The pair-load keeps the DSL explicit about
which two fields travel together and removes another scalar metadata
fetch from the fast path.
2026-04-14 12:37:12 +02:00
Andreas Kling
acbbb2d726 LibJS: Pair-load property IC lookup metadata
Load the cached property offset and dictionary generation with paired
loads in the property inline-cache fast paths. AsmIntGen now verifies
these reads against the actual cache layout, so the DSL keeps both
fields named and self-documenting.
2026-04-14 12:37:12 +02:00
Andreas Kling
335c278b8f LibJS: Pair-load property IC shape metadata
Load the cached shape and prototype pointer together in the property
inline-cache fast paths that already read both. This keeps the
cache-entry metadata fetches aligned with the DSL's paired-load model
without changing the surrounding control flow.
2026-04-14 12:37:12 +02:00
Andreas Kling
75eb3a28ce LibJS: Pair-load asm Return resume metadata
Load the inline frame's return pc and destination register at once when
Return or End resumes an asm-managed caller. This keeps the unwind
metadata with the helper that consumes it and removes a separate scalar
load from both handlers.
2026-04-14 12:37:12 +02:00
Andreas Kling
58aa725afb LibJS: Reuse executable state in asm Call
The asm Call fast path was still reloading the executable pointer while
building the inline callee frame, even though it had already loaded the
same pointer while validating the call target.

Carry that executable pointer through frame setup and reload the passed
argument count from the call bytecode instead of the fresh frame header.
This trims a couple more loads from the hot path.
2026-04-14 12:37:12 +02:00
Andreas Kling
517812647a LibJS: Pack asm Call shared-data metadata
Pack the asm Call fast path metadata next to the executable pointer
so the interpreter can fetch both values with one paired load. This
removes several dependent shared-data loads from the hot path.

Keep the executable pointer and packed metadata in separate registers
through this binding so the fast path can still use the paired-load
layout after any non-strict this adjustment.

Lower the packed metadata flag checks correctly on x86_64 as well.
Those bits now live above bit 31, so the generator uses bt for single-
bit high masks and covers that path with a unit test.

Add a runtime test that exercises both object and global this binding
through the asm Call fast path.
2026-04-14 12:37:12 +02:00
Andreas Kling
50c497c59b LibJS: Use precomputed asm Call frame counts
Executable already caches the combined registers, locals, and constants
count that the asm Call fast path needs for inline frame allocation.

Use that precomputed total instead of rebuilding it from the registers
count and constants vector size in the hot path.
2026-04-14 12:37:12 +02:00
Andreas Kling
fffc16b2f6 LibJS: Trust inline-call bytecode in asm Call
The asm Call fast path already checks SharedFunctionInstanceData's
cached can_inline_call bit before touching the executable pointer.

That cached bit is only set for ordinary functions with compiled bytecode,
so the extra executable null check is redundant work in the hot path.
2026-04-14 12:37:12 +02:00
Andreas Kling
44deea24fe LibJS: Pair-load asm Call stack bounds
The asm Call fast path reads InterpreterStack::m_top and m_limit
back-to-back while checking whether the inline callee frame fits.

Those fields are adjacent, so we can load them together with one
paired load and keep the stack-size check otherwise unchanged.
2026-04-14 12:37:12 +02:00
Andreas Kling
fa931612e1 LibJS: Pair-store the asm Call frame setup
Teach the asm Call fast path to use paired stores for the fixed
ExecutionContext header writes and for the caller linkage fields.
This also initializes the five reserved Value slots directly instead
of looping over them as part of the general register clear path.

That keeps the hot frame setup work closer to the actual data layout:
reserved registers are seeded with a couple of fixed stores, while the
remaining register and local slots are cleared in wider chunks.

On x86_64, keep the new explicit-offset formatting on store_pair*
and load_pair* without changing ordinary [base, index, scale]
operands into base-plus-index-plus-offset addresses. Add unit
tests covering both the paired zero-offset form and the preserved
scaled-index lowering.
2026-04-14 12:37:12 +02:00
Andreas Kling
fcbbc6a4b8 LibJS: Add paired stores to the AsmInt DSL
Teach AsmIntGen about store_pair32 and store_pair64 so hot handlers
can describe adjacent writes just as explicitly as adjacent reads.
The DSL now requires naming both memory operands and rejects
non-adjacent or reordered pairs at code generation time.

On aarch64 the new instructions lower to stp when the address is
encodable, while x86_64 keeps the same semantics with two scalar
stores. The shared validation keeps the paired access rules consistent
across both load and store primitives.
2026-04-14 12:37:12 +02:00
Andreas Kling
8ae173f4fd LibJS: Use paired loads in the asm Call fast path
Use the new paired-load DSL operations in the inline Call path for the
adjacent environment, ScriptOrModule, caller metadata, and callee-entry
loads. The flow stays the same, but the hot call setup now needs fewer
scalar memory operations on aarch64.
2026-04-14 12:37:12 +02:00
Andreas Kling
ce753047b0 LibJS: Add verifiable paired loads to the AsmInt DSL
Add load_pair32 and load_pair64 to the AsmInt DSL and make the
generator verify that both named memory operands are truly adjacent.
That keeps paired loads self-documenting in the DSL instead of
hiding the second field behind an implicit adjacency assumption.

AArch64 now lowers valid pairs to ldp when the address form allows
it, while x86_64 keeps the same behavior with two obvious scalar loads.
Add unit tests for the shared validator so reversed or non-adjacent
field pairs are rejected during code generation.
2026-04-14 12:37:12 +02:00
Andreas Kling
8c7c46f8ec LibJS: Inline asm interpreter JS Call fast path
Handle inline-eligible JS-to-JS Call directly in asmint.asm instead
of routing the whole operation through AsmInterpreter.cpp.

The asm handler now validates the callee, binds `this` for the
non-allocating cases, reserves the callee InterpreterStack frame,
populates the ExecutionContext header and Value tail, and enters the
callee bytecode at pc 0.

Keep the cases that need NewFunctionEnvironment() or sloppy `this`
boxing on a narrow helper that still builds an inline frame. This
preserves the existing inline-call semantics for promise-job ordering,
receiver binding, and sloppy global-this handling while keeping the
common path in assembly.

Add regression coverage for closure-capturing callees, sloppy
primitive receivers, and sloppy undefined receivers.
2026-04-14 08:14:43 +02:00
Andreas Kling
7a01a64087 LibJS: Expose asmint Call offset metadata
Emit the ExecutionContext, function-object, executable, and realm
offsets that the asm Call path needs to inspect and initialize
directly when building inline frames.
2026-04-14 08:14:43 +02:00
Andreas Kling
4405c52042 LibJS: Zero-extend 32-bit AArch64 asm immediates
Teach the AArch64 AsmInt generator to materialize immediates through
w-register writes when the upper 32 bits are known zero.

That keeps the same x-register value while letting common constants
use shorter instruction sequences.
2026-04-14 08:14:43 +02:00
Andreas Kling
960a36db53 LibJS: Lower zero store immediates to zero registers on AArch64
Teach the AArch64 AsmInt generator to lower zero-immediate stores
through xzr or wzr instead of materializing a temporary register.

This covers store64 as well as the narrow store8, store16, and
store32 forms, keeping the generated code shorter on the zero
store fast path.
2026-04-14 08:14:43 +02:00
Andreas Kling
87797e9161 LibJS: Use tbz and tbnz for single-bit asm branches
AsmIntGen already lowers branch_zero and branch_nonzero to the compact
AArch64 branch-on-bit forms when possible, but branch_bits_set and
branch_bits_clear still expanded single-bit immediates into tst plus a
separate conditional branch.

Teach the AArch64 backend to recognize power-of-two masks and emit
tbnz or tbz directly. This shortens several hot interpreter paths.
2026-04-14 08:14:43 +02:00
Andreas Kling
12a916d14a LibJS: Handle AsmInt returns without C++ helpers
Handle Return and End entirely in AsmInt when leaving an inline frame.
The handlers now restore the caller, update the interpreter stack
bookkeeping directly, and bump the execution generation without
bouncing through AsmInterpreter.cpp.

Add WeakRef tests that exercise both inline Return and inline End
so this path stays covered.
2026-04-14 08:14:43 +02:00
Andreas Kling
b1dab18e42 LibJS: Teach AsmIntGen helper primitives
Add load_vm, memory-operand macro substitution, and a generic
inc32_mem instruction to the AsmInt DSL.

Also drop redundant mov reg, reg copies in the backends so handlers
that use the new helpers expand to cleaner assembly.
2026-04-14 08:14:43 +02:00
Andreas Kling
df0fdee2a0 LibJS: Cache JS-to-JS inline call eligibility
Store whether a function can participate in JS-to-JS inline calls on
SharedFunctionInstanceData instead of recomputing the function kind,
class-constructor bit, and bytecode availability at each fast-path
call site.
2026-04-14 08:14:43 +02:00
Timothy Flynn
bf15cabbc0 LibJS: Remove Utf16String.h
This was removed in a43cb15e81 and then
mistakenly re-added in 539a675802.
2026-04-13 15:26:48 -04:00
Andreas Kling
c301a21960 LibJS: Skip preserving zero-argument call callees
The callee and this-value preservation copies only matter while later
argument expressions are still being evaluated. For zero-argument calls
there is nothing left to clobber them, so we can keep the original
operand and let the interpreter load it directly.

This removes the hot Mov arg0->reg pattern from zero-argument local
calls and reduces register pressure.
2026-04-13 18:29:43 +02:00
Andreas Kling
3a08f7b95f LibJS: Drop dead entry GetLexicalEnvironment loads
Teach the Rust bytecode generator to treat the synthetic entry
GetLexicalEnvironment as a removable prologue load.

We still model reg4 as the saved entry lexical environment during
codegen, but assemble() now deletes that load when no emitted
instruction refers to the saved environment register. This keeps the
semantics of unwinding and environment restoration intact while letting
empty functions and other simple bodies start at their first real
instruction.
2026-04-13 18:29:43 +02:00
Andreas Kling
9af5508aef LibJS: Split inline frames from execution context stack
Keep JS-to-JS inline calls out of m_execution_context_stack and walk
the active stack from the running execution context instead. Base
pushes now record the previous running context so duplicate
TemporaryExecutionContext pushes and host re-entry still restore
correctly.

This keeps the fast JS-to-JS path off the vector without losing GC
root collection, stack traces, or helpers that need to inspect the
active execution context chain.
2026-04-13 18:29:43 +02:00
Andreas Kling
2ca7dfa649 LibJS: Move bytecode interpreter state to VM
The bytecode interpreter only needed the running execution context,
but still threaded a separate Interpreter object through both the C++
and asm entry points. Move that state and the bytecode execution
helpers onto VM instead, and teach the asm generator and slow paths to
use VM directly.
2026-04-13 18:29:43 +02:00