Remove four fields that are trivially derivable from other fields
already present in the ExecutionContext:
- global_object (from realm)
- global_declarative_environment (from realm)
- identifier_table (from executable)
- property_key_table (from executable)
This shrinks ExecutionContext from 192 to 160 bytes (-17%).
The asmint's GetGlobal/SetGlobal handlers now load through the realm
pointer, taking advantage of the cached declarative environment
pointer added in the previous commit.
Instead of recursing through 5 native stack frames per JS function
call (execute_call -> internal_call -> ordinary_call_evaluate_body ->
run_executable -> run_bytecode), handle Call and CallConstruct for
normal ECMAScript functions directly in the dispatch loop.
The fast path allocates the callee's execution context on the
InterpreterStack, copies arguments, sets up the environment, and
jumps to the callee's bytecode entry point. Return and End unwind
inline frames by restoring the caller's state. Exception unwinding
walks through inline frames to find handlers.
The fast path code is kept in NEVER_INLINE helper functions
(try_inline_call, try_inline_call_construct, pop_inline_frame) to
minimize register pressure in the dispatch loop. handle_exception
takes program_counter by value to avoid forcing it onto the stack.
Reloading of bytecode/program_counter after frame switches is done
inline at each call site via RELOAD_AND_GOTO_START to preserve a
single dispatch entry point for optimal indirect branch prediction.
Replace alloca-based execution context allocation with InterpreterStack
bump allocation across all call sites: bytecode call instructions,
AbstractOperations call/construct, script evaluation, module evaluation,
and LibWeb module script evaluation.
Also replace the native stack space check with an InterpreterStack
exhaustion check, and remove the now-unused alloca macros from
ExecutionContext.h.
After removing the unwind context stack, ExecutionContextRareData only
held two GC::Ptr fields — both trivially destructible. The indirection
cost more than it saved: a GC cell allocation per EC, an extra pointer
chase on every source range lookup, and unnecessary complexity.
Replace the rare data cell with two inline fields on ExecutionContext:
cached_source_range and context_owner.
The runtime unwind context stack was pushed by EnterUnwindContext
and popped by LeaveUnwindContext. With both opcodes removed, it is
no longer read or written by anything.
Remove UnwindInfo, the unwind_contexts vector, its GC visit loop,
its copy in ExecutionContext::copy(), and the VERIFY assertions that
referenced it in handle_exception() and catch_exception().
Replace the saved_lexical_environments stack in ExecutionContextRareData
with explicit register-based environment tracking. Environments are now
stored in registers and restored via SetLexicalEnvironment, making the
environment flow visible in bytecode.
Key changes:
- Add GetLexicalEnvironment and SetLexicalEnvironment opcodes
- CreateLexicalEnvironment takes explicit parent and dst operands
- EnterObjectEnvironment stores new environment in a dst register
- NewClass takes an explicit class_environment operand
- Remove LeaveLexicalEnvironment opcode (instead: SetLexicalEnvironment)
- Remove saved_lexical_environments from ExecutionContextRareData
- Use a reserved register for the saved lexical environment to avoid
dominance issues with lazily-emitted GetLexicalEnvironment
Each finally scope gets two registers (completion_type and
completion_value) that form an explicit completion record. Every path
into the finally body sets these before jumping, and a dispatch chain
after the finally body routes to the correct continuation.
This replaces the old implicit protocol that relied on the exception
register, a saved_return_value register, and a scheduled_jump field
on ExecutionContext, allowing us to remove:
- 5 opcodes (ContinuePendingUnwind, ScheduleJump, LeaveFinally,
RestoreScheduledJump, PrepareYield)
- 1 reserved register (saved_return_value)
- 2 ExecutionContext fields (scheduled_jump, previously_scheduled_jumps)
LibJS+DevTools: Implement console.trace() with source locations
- Add Console::TraceFrame struct with source location data
- Implement Console::trace() to gather stack information
- Add WebView::StackFrame and ConsoleTrace for IPC
- Implement DevToolsConsoleClient::printer() for traces
- Update FrameActor to format traces for DevTools
- Update WorkerDebugConsoleClient trace handling
- Update ReplConsoleClient to format trace output
Every function call allocates an ExecutionContext with a trailing array
of Values for registers, locals, constants, and arguments. Previously,
the constructor would initialize all slots to js_special_empty_value(),
but constant slots were then immediately overwritten by the interpreter
copying in values from the Executable before execution began.
To eliminate this redundant initialization, we rearrange the layout from
[registers | constants | locals] to [registers | locals | constants].
This groups registers and locals together at the front, allowing us to
initialize only those slots while leaving constant slots uninitialized
until they're populated with their actual values.
This reduces the per-call initialization cost from O(registers + locals
+ constants) to O(registers + locals).
Also tightens up the types involved (size_t -> u32) and adds VERIFYs to
guard against overflow when computing the combined slot counts, and to
ensure the total fits within the 29-bit operand index field.
Instead of creating PropertyKeys on the fly during interpreter
execution, we now store fully-formed ones in the Executable.
This avoids a whole bunch of busywork in property access instructions
and substantially reduces code size bloat.
This was causing a fair bit of root registration churn on pages that
throw lots of errors.
Since there's no need for these pointers to float around freely, we can
just visit them during the mark phase as usual.
Instead of using this span, we can just use the getter that calculates
the base of the register/constant/local/argument array based on the
ExecutionContext's own address.
This simplifies function entry/exit and lets us just walk away from the
used ExecutionContext instead of resetting a bunch of its state when
returning control to the caller.
Instead of having ExecutionContext track function names separately,
we give FunctionObject a virtual function that returns an appropriate
name string for use in call stacks.
This reverts commit c14173f651. We
should only annotate the minimum number of symbols that external
consumers actually use, so I am starting from scratch to do that
This way it's always automatically correct, and we don't have to
manually flush it in push_execution_context().
~7% speedup on the MicroBench/call* tests :^)
Instead of letting every [[Call]] implementation allocate an
ExecutionContext, we now make that a responsibility of the caller.
The main point of this exercise is to allow the Call instruction
to write function arguments directly into the callee ExecutionContext
instead of copying them later.
This makes function calls significantly faster:
- 10-20% faster on micro-benchmarks (depending on argument count)
- 4% speedup on Kraken
- 2% speedup on Octane
- 5% speedup on JetStream
By doing that we avoid doing separate allocation for each such vector,
which was really expensive on js heavy websites. For example this change
helps to get EC allocation down from ~17% to ~2% on Google Maps. This
comes at cost of adding extra complexity to custom execution context
allocator, because EC no longer has fixed size and we need to maintain
a list of buckets.
This is better because:
- Better data locality
- Allocate vector for registers+constants+locals+arguments in one go
instead of allocating two vectors separately
The special empty value (that we use for array holes, Optional<Value>
when empty and a few other other placeholder/sentinel tasks) still
exists, but you now create one via JS::js_special_empty_value() and
check for it with Value::is_special_empty_value().
The main idea here is to make it very unlikely to accidentally create an
unexpected special empty value.
Resulting in a massive rename across almost everywhere! Alongside the
namespace change, we now have the following names:
* JS::NonnullGCPtr -> GC::Ref
* JS::GCPtr -> GC::Ptr
* JS::HeapFunction -> GC::Function
* JS::CellImpl -> GC::Cell
* JS::Handle -> GC::Root