Commit Graph

1376 Commits

Author SHA1 Message Date
Jelle Raaijmakers
e123d48043 AK: Add SentinelOptional
We specialize `Optional<T>` for value types that inherently support some
kind of "empty" value, or whose value range allows for a sentinel value
that is unlikely to be useful and can mean "empty", instead of the boolean flag
a regular Optional<T> needs to store. Because of padding, this often
means saving 4 to 8 bytes per instance.

By extending the new `SentinelOptional<T, Traits>`, these
specializations are significantly simplified to just having to define
what the sentinel value is, and how to identify a sentinel value.
2026-03-20 12:03:36 +01:00
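As a rough illustration of the idea in e123d48043, the following is a minimal sketch of a sentinel-backed optional. It is not AK's actual `SentinelOptional`: the names `SentinelOptionalSketch`, `IndexSentinelTraits`, and `FlaggedOptionalU32` are hypothetical, and the choice of `UINT32_MAX` as the sentinel is an assumption made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical traits type: names the sentinel and how to recognize it.
struct IndexSentinelTraits {
    static constexpr std::uint32_t sentinel() { return std::numeric_limits<std::uint32_t>::max(); }
    static constexpr bool is_sentinel(std::uint32_t value) { return value == sentinel(); }
};

// Illustrative stand-in, not the real AK class.
template<typename T, typename Traits>
class SentinelOptionalSketch {
public:
    SentinelOptionalSketch() = default;
    SentinelOptionalSketch(T value)
        : m_value(value)
    {
        // Storing the sentinel itself would be indistinguishable from "empty".
        assert(!Traits::is_sentinel(value));
    }

    bool has_value() const { return !Traits::is_sentinel(m_value); }
    T value() const
    {
        assert(has_value());
        return m_value;
    }
    void clear() { m_value = Traits::sentinel(); }

private:
    // No separate bool: "empty" is encoded in the value itself.
    T m_value { Traits::sentinel() };
};

// A flag-based optional for comparison: the bool plus padding grows the type.
struct FlaggedOptionalU32 {
    std::uint32_t value { 0 };
    bool has_value { false };
};

static_assert(sizeof(SentinelOptionalSketch<std::uint32_t, IndexSentinelTraits>) == sizeof(std::uint32_t),
    "sentinel optional adds no storage");
static_assert(sizeof(FlaggedOptionalU32) > sizeof(std::uint32_t),
    "flag-based optional needs extra bytes");
```

The two `static_assert`s show the size difference the commit message refers to: the sentinel version is exactly as large as the wrapped value.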
Andreas Kling
362207b45d LibJS: Remove remaining C++ pipeline artifacts
Clean up leftover references to the removed C++ pipeline:

- Remove stale forward declarations from Forward.h (ASTNode,
  Parser, Program, FunctionNode, ScopeNode, etc.)
- Delete unused FunctionParsingInsights.h
- Remove dead get_builtin(MemberExpression const&) declaration
  from Builtins.h
- Update stale comments referencing ASTCodegen.cpp and
  generate_bytecode()
2026-03-19 21:55:10 -05:00
Andreas Kling
30f108ba36 LibJS: Remove C++ lexer, use Rust tokenizer for syntax highlighting
Delete Lexer.cpp/h and Token.cpp, replacing all tokenization with a
new rust_tokenize() FFI function that calls back for each token.

Rewrite SyntaxHighlighter.cpp and js.cpp REPL to use the Rust
tokenizer. The token type and category enums in Token.h now mirror
the Rust definitions in token.rs.

Move is_syntax_character/is_whitespace/is_line_terminator helpers
into RegExpConstructor.cpp as static functions, since they were only
used there.
2026-03-19 21:55:10 -05:00
Andreas Kling
8ec7e7c07c LibJS: Remove C++ AST
Delete AST.cpp, AST.h, ASTDump.cpp, ScopeRecord.h, and the dead
get_builtin(MemberExpression const&) from Builtins.cpp.

Extract ImportEntry and ExportEntry into a new ModuleEntry.h,
since they are data types used by the module system, not AST
node types.

Inline ModuleRequest's sorting constructor and
SourceRange::filename().

Remove the dead annex_b_function_declarations field from
EvalDeclarationData, which was only populated by the C++ parser.
2026-03-19 21:55:10 -05:00
Andreas Kling
1f6ca58e55 LibJS: Remove C++ AST constructor from SharedFunctionInstanceData
Remove the constructor that took C++ AST nodes (FunctionParameters,
Statement), along with create_for_function_node() and the
m_formal_parameters / m_ecmascript_code fields. These were only used
by the now-removed C++ compilation pipeline.

Also remove the dead EvalDeclarationData::create(VM&, Program&, bool)
and ECMAScriptFunctionObject::ecmascript_code() accessor.
2026-03-19 21:55:10 -05:00
Andreas Kling
272562ddc5 LibJS: Remove dead C++ bytecode compilation functions
Remove Bytecode::compile() and the old create() overloads on
ECMAScriptFunctionObject that accepted C++ AST nodes. These
have no remaining callers now that all compilation goes through
the Rust pipeline.

Also remove the if-constexpr Parse Node branch from
async_block_start, since the Statement template instantiation
was already removed.

Fix transitive include dependencies on Generator.h by adding
explicit includes for headers that were previously pulled in
transitively.
2026-03-19 21:55:10 -05:00
Andreas Kling
3518efd71c LibJS+LibWeb: Port remaining callers to Rust pipeline
Port all remaining users of the C++ Parser/Lexer/Generator to
use the Rust pipeline instead:

- Intrinsics: Remove C++ fallback in parse_builtin_file()
- ECMAScriptFunctionObject: Remove C++ compile() fallback
- NativeJavaScriptBackedFunction: Remove C++ compile() fallback
- EventTarget: Port to compile_dynamic_function
- WebDriver/ExecuteScript: Port to compile_dynamic_function
- LibTest/JavaScriptTestRunner.h: Remove Parser/Lexer includes
- FuzzilliJs: Remove unused Parser/Lexer includes

Also remove the dead Statement-based template instantiation of
async_block_start/async_function_start.
2026-03-19 21:55:10 -05:00
Andreas Kling
77cd434710 LibJS: Remove C++ compiler pipeline fallback paths
Now that the Rust pipeline is the sole compilation path, remove all
C++ parser/codegen fallback paths from the callers:

- Script::parse() no longer falls back to C++ Parser
- SourceTextModule::parse() no longer falls back to C++ Parser
- perform_eval() no longer falls back to C++ Parser + Generator
- create_dynamic_function() no longer falls back to C++ Parser
- ShadowRealm eval no longer falls back to C++ Parser + Generator
- Interpreter::run(Script&) no longer falls back to Generator

Also remove the now-dead old constructors that took C++ AST nodes,
the module_requests() helper, and AST dump code from js.cpp.
2026-03-19 21:55:10 -05:00
Andreas Kling
2c45472a11 LibJS: Remove pipeline comparison infrastructure
Remove PipelineComparison.cpp/h and all LIBJS_COMPARE_PIPELINES
support from RustIntegration.cpp. This includes:

- The compare_pipelines_enabled() function
- All comparison blocks in compile_script/eval/module/function
- The pair_shared_function_data() helper
- The m_cpp_comparison_sfd field on SharedFunctionInstanceData

The Rust pipeline has been validated extensively through comparison
testing and no longer needs the side-by-side verification harness.
2026-03-19 21:55:10 -05:00
Andreas Kling
9299d430c8 LibJS: Cache typed array data pointers for indexed access
Cache raw data pointers on fixed-length typed array views so asm
GetByValue and PutByValue can use them directly for indexed
element access.

Replace the asm typed-array hot-path
ArrayBuffer/DataBlock/ByteBuffer walk with one cached_data_ptr load.
Remove six unconditional loads, four branches, and the byte_offset
add before the element access, trading them for one
cached_data_ptr null check.

Keep direct C++ typed-array access on IsValidIntegerIndex-based
checks, invalidate cached pointers eagerly when a backing
ArrayBuffer is detached, and add regression coverage for shrink,
regrow, and detach on number and BigInt typed arrays.
2026-03-18 13:59:05 -05:00
Andreas Kling
b4185f0ecd LibJS: Split packed and holey asm indexed fast paths
Use dedicated Packed branches in GetByValue and PutByValue so
in-bounds indexed accesses can skip hole checks and slot
reloads.

Keep Holey writes on the guarded arm, and keep append writes on
the C++ slow path so PutByValue still respects non-extensible
indexed objects and arrays with a non-writable length.

Add a bytecode regression that exercises both append failure
cases through the real js binary path.
2026-03-17 22:28:35 -05:00
Andreas Kling
5f586ae406 LibJS: Promote Holey arrays to Packed when all holes are filled
Arrays created via new Array(N) or by setting .length start as Holey
since their elements are not present. After sequential fill (e.g.
for (i=0; i<N; i++) a[i]=v), all holes are filled but the array
remained Holey, preventing the Packed fast paths in the asm
interpreter from triggering.

Now, whenever indexed_put() writes to the last index of a Holey
array, we scan for remaining holes and promote to Packed if none
are found. Only checking on writes to the last index avoids O(N^2)
scanning on partial fills while still catching the common
sequential fill pattern.
2026-03-17 22:28:35 -05:00
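The promotion rule from 5f586ae406 can be sketched in a few lines. This is a simplified model under stated assumptions: `IndexedStoreSketch` is a hypothetical type, `std::optional<int>` stands in for a JS value slot, and `std::nullopt` stands in for the empty sentinel that marks a hole.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

enum class StorageKind { Packed, Holey };

// Hypothetical miniature of an indexed store; std::nullopt marks a hole.
struct IndexedStoreSketch {
    std::vector<std::optional<int>> elements;
    StorageKind kind { StorageKind::Packed };

    // Mirrors `new Array(N)`: N slots, all of them holes.
    static IndexedStoreSketch with_length(std::size_t length)
    {
        IndexedStoreSketch store;
        store.elements.resize(length);
        store.kind = length == 0 ? StorageKind::Packed : StorageKind::Holey;
        return store;
    }

    void put(std::size_t index, int value)
    {
        assert(index < elements.size());
        elements[index] = value;
        // Only a write to the last index triggers the scan, so a partial
        // fill of N elements does not rescan the array on every write.
        if (kind == StorageKind::Holey && index + 1 == elements.size())
            promote_if_no_holes();
    }

private:
    void promote_if_no_holes()
    {
        for (auto const& element : elements) {
            if (!element.has_value())
                return; // At least one hole remains; stay Holey.
        }
        kind = StorageKind::Packed;
    }
};
```

A sequential fill of a four-element store stays Holey for the first three writes and becomes Packed on the fourth; a lone write to the last index of an otherwise empty store scans once and stays Holey.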
Andreas Kling
5895cacc21 LibJS: Add Array.prototype fast paths for packed arrays
When the receiver is an Array with packed storage and an intact default
prototype chain, some methods can skip the generic property access
machinery and operate directly on the indexed element storage.

This patch adds fast paths for push(), pop(), concat(), slice() and
splice().
2026-03-17 22:28:35 -05:00
Andreas Kling
614713ed08 LibJS: Replace IndexedProperties with inline Packed/Holey/Dictionary
Replace the OwnPtr<IndexedPropertyStorage> indirection with inline
indexed element storage directly on Object. This eliminates virtual
dispatch and reduces indirection for indexed property access.

The new system uses three storage kinds tracked by IndexedStorageKind:

- Packed: Dense array, no holes. Elements stored in a malloced Value*
  array with capacity header (same layout as named properties).
- Holey: Dense array with possible holes marked by empty sentinel.
  Same physical layout as Packed.
- Dictionary: Sparse storage using GenericIndexedPropertyStorage,
  type-punned into the m_indexed_elements pointer.

Transitions: None->Packed->Holey->Dictionary (mostly monotonic).
Dictionary mode triggers on non-default attributes or sparse arrays.

Object keeps the same 48-byte size since m_indexed_elements (8 bytes)
replaces IndexedProperties (8 bytes), and the storage kind + array
size fit in existing padding alongside m_flags.

The asm interpreter benefits from one fewer indirection: it now reads
the element pointer and array size directly from Object fields instead
of chasing through OwnPtr -> IndexedPropertyStorage -> Vector.

Removes: IndexedProperties, SimpleIndexedPropertyStorage,
IndexedPropertyStorage, IndexedPropertyIterator.
Keeps: GenericIndexedPropertyStorage (for Dictionary mode).
2026-03-17 22:28:35 -05:00
Andreas Kling
f574ef528d LibJS: Replace Vector<Value> with Value* for named property storage
Replace the 24-byte Vector<Value> m_storage with an 8-byte raw
Value* m_named_properties pointer, backed by a malloc'd allocation
with an inline capacity header.

Memory layout of the allocation:
  [u32 capacity] [u32 padding] [Value 0] [Value 1] ...
  m_named_properties points to Value 0.

This shrinks JS::Object from 64 to 48 bytes (on non-Windows
platforms) and removes one level of indirection for property access
in the asm interpreter, since the data pointer is now stored directly
on the object rather than inside a Vector's internal metadata.

Growth policy: max(4, max(needed, old_capacity * 2)).
2026-03-17 22:28:35 -05:00
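The allocation layout and growth policy described in f574ef528d can be modelled as below. This is a sketch, not LibJS code: `Header`, `capacity_of`, `grown_capacity`, `ensure_capacity`, and `release` are hypothetical names, and `Value` is replaced by a plain 64-bit integer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <cstring>

using Value = std::uint64_t; // Stand-in for the 8-byte JS::Value.

// [u32 capacity] [u32 padding] precedes Value 0, keeping the values 8-byte aligned.
struct Header {
    std::uint32_t capacity;
    std::uint32_t padding;
};
static_assert(sizeof(Header) == sizeof(Value), "header must not disturb Value alignment");

// Reads the capacity stored immediately before Value 0.
inline std::uint32_t capacity_of(Value const* properties)
{
    if (!properties)
        return 0;
    Header header;
    std::memcpy(&header, reinterpret_cast<unsigned char const*>(properties) - sizeof(Header), sizeof(Header));
    return header.capacity;
}

// Growth policy quoted in the commit: max(4, max(needed, old_capacity * 2)).
inline std::uint32_t grown_capacity(std::uint32_t needed, std::uint32_t old_capacity)
{
    return std::max<std::uint32_t>(4, std::max(needed, old_capacity * 2));
}

// Returns a pointer to Value 0 of a block with room for at least `needed` values.
inline Value* ensure_capacity(Value* properties, std::uint32_t needed)
{
    auto old_capacity = capacity_of(properties);
    if (needed <= old_capacity)
        return properties;
    auto new_capacity = grown_capacity(needed, old_capacity);
    void* old_block = properties ? reinterpret_cast<unsigned char*>(properties) - sizeof(Header) : nullptr;
    auto* block = static_cast<unsigned char*>(std::realloc(old_block, sizeof(Header) + sizeof(Value) * new_capacity));
    assert(block);
    Header header { new_capacity, 0 };
    std::memcpy(block, &header, sizeof(Header));
    return reinterpret_cast<Value*>(block + sizeof(Header));
}

inline void release(Value* properties)
{
    if (properties)
        std::free(reinterpret_cast<unsigned char*>(properties) - sizeof(Header));
}
```

Because the object only stores the pointer to Value 0, a property read is a single indexed load, while the capacity is consulted only on the slower growth path.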
Andreas Kling
3e1145ef07 LibJS: Synchronous await fast path when microtask queue is empty
When an async function is resumed from a microtask and hits another
await with a non-thenable value (primitive or already-settled native
promise), and the microtask queue is empty, we can resolve the await
synchronously without suspending. No other microtask can observe the
difference in execution order, making this optimization safe.

This avoids the overhead of creating a GC::Function for the microtask
job, enqueuing/dequeuing from the microtask queue, and the execution
context push/pop that comes with it.

A new VM host hook, host_promise_job_queue_is_empty, is added so both
the standalone js binary and LibWeb can provide the appropriate check
for their respective job queue implementations.
2026-03-16 19:15:03 -05:00
Andreas Kling
3a2f2f3926 LibJS: Add fast path in async function await for non-thenable values
Per spec, every `await` goes through PromiseResolve (which wraps the
value in a new Promise via NewPromiseCapability) and then
PerformPromiseThen (which creates PromiseReaction and JobCallback
objects). This results in 13-16 GC cell allocations per await.

Add a fast path that detects two common cases:

1. Primitive values: These can never have a "then" property, so we
   can skip all promise wrapping and directly schedule the async
   function's continuation as a microtask.

2. Already-settled native Promises: If the promise has no own
   properties and its prototype is the intrinsic %Promise.prototype%,
   we can extract the result directly and schedule continuation.

For these cases, we bypass promise_resolve(), new_promise_capability(),
create_resolving_functions(), perform_then(), PromiseReaction creation,
and JobCallback creation -- replacing ~13 GC allocations with 1
(the GC::Function for the microtask job).
2026-03-16 12:01:49 -05:00
Shannon Booth
b34274a2a0 LibJS: Combine onFulfilled and onRejected into a single settled callback
The fulfilled and rejected closures in AsyncFunctionDriverWrapper::await
were identical in structure, differing only in whether they resumed the
async context with a NormalCompletion or ThrowCompletion.

Since the callback has access to the current promise, we can check its
state at reaction time to determine which completion to use. This allows
us to allocate a single GC-tracked NativeFunction (m_on_settled) and
register it for both the onFulfilled and onRejected slots in
PerformPromiseThen, halving the number of GC allocations on this path.
2026-03-16 11:18:35 -05:00
Timothy Flynn
776134ce03 LibJS+LibUnicode: Add an API to loop over Unicode extensions of one type
2026-03-14 08:17:03 -04:00
Timothy Flynn
d094de39fc LibJS: Format "islamic" and "islamic-rgsa" calendars as "islamic-tbla"
This was missed when implementing the Intl Era and Month Code proposal.
2026-03-13 14:43:45 -04:00
Timothy Flynn
b47e4acc96 LibJS: Preserve the original time zone identifier in Intl.DateTimeFormat
This was missed when updating the Intl.DateTimeFormat constructor to
support Temporal.
2026-03-13 14:43:45 -04:00
Timothy Flynn
236037730f LibJS: Update spec steps for the Intl Locale info proposal
This proposal has reached stage 4 and was merged into ECMA-402. See:
https://github.com/tc39/ecma402/commit/70b0ecc
2026-03-13 14:42:51 -04:00
Timothy Flynn
101fee6cb2 LibJS+LibUnicode: Rename a LibUnicode FFI function for clarity
There will be an extremely similar function that accepts a calendar year
rather than ISO year in a subsequent commit. Rename the ISO year variant
so that they are more distinguishable.
2026-03-12 17:29:59 -05:00
Timothy Flynn
397be77866 LibJS+LibUnicode: Migrate MonthCode and its utilities to LibUnicode
Will be used in a Chinese/Dangi calendar implementation.
2026-03-12 17:29:59 -05:00
Tim Ledbetter
36f74ba96c Revert "LibJS: Shrink ExecutionContext by replacing ScriptOrModule …"
… with Cell*.

This reverts commit d3495c62a7.
2026-03-11 23:13:18 +00:00
Andreas Kling
d3495c62a7 LibJS: Shrink ExecutionContext by replacing ScriptOrModule with Cell*
Replace the 16-byte Variant<Empty, GC::Ref<Script>, GC::Ref<Module>>
with a simple 8-byte GC::Ptr<Cell> that points to either a Script or
Module (or is null for Empty).

A helper function script_or_module_from_cell() converts back to the
full ScriptOrModule variant when needed (e.g. in
VM::get_active_script_or_module).
2026-03-11 13:33:47 +01:00
Andreas Kling
c8ad07dece LibJS: Remove unused caller_executable from ExecutionContext
This field was written by push_inline_frame but never read anywhere.
The caller's executable is accessible via caller_frame->executable
if ever needed.

Shrinks ExecutionContext from 120 to 112 bytes.
2026-03-11 13:33:47 +01:00
Andreas Kling
5f463ed989 LibJS: Replace arguments Span with argument_count in ExecutionContext
The arguments Span (pointer + size = 16 bytes) was always derivable
from the tail array layout: data = values + (total_count - arg_count).

Replace it with a u32 argument_count and derive the span on demand
via arguments_span() / arguments_data() accessors.

Shrinks ExecutionContext from 136 to 120 bytes.
2026-03-11 13:33:47 +01:00
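The derivation `data = values + (total_count - arg_count)` from 5f463ed989 can be illustrated with a small model. `FrameSketch` is a hypothetical type, and a `std::vector` stands in for the tail array that the real ExecutionContext carries.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Value = std::uint64_t; // Stand-in for JS::Value.

// Hypothetical frame: other slots come first, arguments sit at the tail of one array.
struct FrameSketch {
    std::vector<Value> values;          // Stand-in for the trailing value array.
    std::uint32_t argument_count { 0 }; // Replaces a stored pointer + size pair.

    // Derive the start of the arguments on demand instead of storing it.
    Value* arguments_data()
    {
        assert(argument_count <= values.size());
        return values.data() + (values.size() - argument_count);
    }
};
```

Storing only the 4-byte count rather than a 16-byte pointer-and-size pair is what makes room for the size reduction the commit reports.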
Andreas Kling
75e7bc1e2a LibJS: Move source range cache from ExecutionContext to Executable
CachedSourceRange was a GC-allocated cell stored on the
ExecutionContext, only needed because ExecutionContext must be
trivially destructible.

Move the source range cache to a HashMap<u32, SourceRange> on the
Executable (keyed by program counter), where it belongs. This
eliminates the GC::Cell subclass entirely and removes the
cached_source_range field from ExecutionContext.

StackTraceElement and TracebackFrame now store Optional<SourceRange>
directly instead of GC::Ptr<CachedSourceRange>.

Shrinks ExecutionContext from 144 to 136 bytes.
2026-03-11 13:33:47 +01:00
Andreas Kling
f02b67a700 LibJS: Remove context_owner from ExecutionContext
This field was only used by LibWeb to prevent GC collection of the
EnvironmentSettingsObject while its execution context is on the stack.

This is unnecessary because the ESO is already reachable through the
realm's host_defined pointer: EC -> realm -> host_defined ->
PrincipalHostDefined -> environment_settings_object.

Shrinks ExecutionContext from 152 to 144 bytes.
2026-03-11 13:33:47 +01:00
Andreas Kling
d0f8d56224 LibJS: Reorder ExecutionContext fields to eliminate padding
Group the u32 fields (passed_argument_count, caller_return_pc,
caller_dst_raw) together so they pack naturally without alignment
padding between pointer-sized fields.

This shrinks ExecutionContext from 160 to 152 bytes.
2026-03-11 13:33:47 +01:00
Andreas Kling
96d02d5249 LibJS: Remove derivable fields from ExecutionContext
Remove four fields that are trivially derivable from other fields
already present in the ExecutionContext:

- global_object (from realm)
- global_declarative_environment (from realm)
- identifier_table (from executable)
- property_key_table (from executable)

This shrinks ExecutionContext from 192 to 160 bytes (-17%).

The asmint's GetGlobal/SetGlobal handlers now load through the realm
pointer, taking advantage of the cached declarative environment
pointer added in the previous commit.
2026-03-11 13:33:47 +01:00
Andreas Kling
e70f580e5c LibJS: Cache global declarative environment pointer in Realm
Realm now caches a direct pointer to the global declarative
environment record, updated when set_global_environment() is called.
This avoids an extra pointer chase through GlobalEnvironment in hot
paths like the asmint's GetGlobal/SetGlobal handlers.
2026-03-11 13:33:47 +01:00
Timothy Flynn
86c8a57794 LibJS+LibUnicode: Use icu4x for Temporal calendar operations
Replace the icu4c-based calendar implementation with one built on the
icu4x Rust crate (icu_calendar).

The icu4c API does not expose the píngqì month-assignment algorithm
used by the Chinese and Dangi lunisolar calendars. Our old code had to
approximate this by walking months via epoch millisecond arithmetic and
manually tracking leap month positions, which produced incorrect month
codes and ordinal month numbers for certain years. The icu4x calendar
crate handles píngqì natively.

With this patch, which is almost a 1-to-1 mapping of ICU invocations, we
pass 100% of all Temporal test262 tests.

The end goal might be to use icu4x for all of our ICU needs. But it does
not yet provide the APIs needed for all ECMA-402 prototypes.
2026-03-11 07:09:57 -04:00
Timothy Flynn
b800c97ab8 LibJS+LibUnicode: Add support for non-ISO-8601 Temporal calendars
This adds international calendar support to our Temporal implementation,
using the Intl Era and Month Code Proposal as a guide. See:

https://tc39.es/proposal-intl-era-monthcode/
2026-03-09 11:40:59 +01:00
Timothy Flynn
a41f2f56a8 LibJS+LibUnicode: Migrate some Temporal calendar types to LibUnicode
These will be needed for calendar operations involving ICU.
2026-03-09 11:40:59 +01:00
Timothy Flynn
b322d0fe84 LibJS: Add an infallible override of ParseMonthCode
This will be useful to avoid needing to pass around a VM in a bunch of
AOs in a future commit.
2026-03-09 11:40:59 +01:00
Timothy Flynn
2e74b91ca1 LibJS: Pass calendar strings around as String more regularly
Same as commit f9fa548d43.

These are String from the outset, so this patch is almost entirely just
changing function parameter types. This will allow us to cache calendar
objects in ICU without invoking any extra allocations.
2026-03-09 11:40:59 +01:00
Timothy Flynn
88365031f2 LibJS+LibUnicode: Implement support for handling gaps in time zones
2026-03-09 11:40:59 +01:00
Timothy Flynn
544e6ee3bb LibJS: Use the correct error type for invalid time zone names
2026-03-09 11:40:59 +01:00
Timothy Flynn
aa435bdd7d LibJS: Correctly parse time zones with a negative GMT offset
2026-03-09 11:40:59 +01:00
Andreas Kling
fe48e27a05 LibJS: Replace GC::Weak with GC::RawPtr in inline cache entries
Property lookup cache entries previously used GC::Weak<T> for shape,
prototype, and prototype_chain_validity pointers. Each GC::Weak
requires a ref-counted WeakImpl allocation and an extra indirection
on every access.

Replace these with GC::RawPtr<T> and make Executable a WeakContainer
so the GC can clear stale pointers during sweep via remove_dead_cells.

For static PropertyLookupCache instances (used throughout the runtime
for well-known property lookups), introduce StaticPropertyLookupCache
which registers itself in a global list that also gets swept.

Now that inline cache entries use GC::RawPtr instead of GC::Weak,
we can compare shape/prototype pointers directly without going
through the WeakImpl indirection. This removes one dependent load
from each IC check in GetById, PutById, GetLength, GetGlobal, and
SetGlobal handlers.
2026-03-08 10:27:13 +01:00
Andreas Kling
c5427e5f4e LibJS: Convert Object bitfields to a flags byte
Replace individual bool bitfields in Object (m_is_extensible,
m_has_parameter_map, m_has_magical_length_property, etc.) with a
single u8 m_flags field and Flag:: constants.

This consolidates 8 scattered bitfields into one byte with explicit
bit positions, making them easy to access from generated assembly
code at a known offset. It also converts the virtual is_function()
and is_ecmascript_function_object() methods to flag-based checks,
avoiding virtual dispatch for these hot queries.

ProxyObject now explicitly clears the IsFunction flag in its
constructor when wrapping a non-callable target, instead of relying
on a virtual is_function() override.
2026-03-07 13:09:59 +01:00
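A minimal sketch of the flags-byte pattern from c5427e5f4e follows. The bit positions and the `ObjectSketch` type are assumptions made for illustration; the commit names the flags but this listing does not show their actual values.

```cpp
#include <cassert>
#include <cstdint>

struct ObjectSketch {
    // Hypothetical bit assignments, chosen only for this example.
    struct Flag {
        static constexpr std::uint8_t IsExtensible = 1 << 0;
        static constexpr std::uint8_t HasParameterMap = 1 << 1;
        static constexpr std::uint8_t IsFunction = 1 << 2;
    };

    bool has_flag(std::uint8_t flag) const { return (m_flags & flag) != 0; }

    void set_flag(std::uint8_t flag, bool on)
    {
        if (on)
            m_flags |= flag;
        else
            m_flags &= static_cast<std::uint8_t>(~flag);
    }

    // A flag test replaces what would otherwise be a virtual call.
    bool is_function() const { return has_flag(Flag::IsFunction); }

    // One byte at a fixed offset, so generated code can test bits directly.
    std::uint8_t m_flags { Flag::IsExtensible };
};

static_assert(sizeof(ObjectSketch) == 1, "all flags share a single byte");
```

A wrapper type can clear `IsFunction` in its constructor with `set_flag(Flag::IsFunction, false)`, which is the same shape as the ProxyObject change described in the commit.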
Andreas Kling
27fa0aac98 LibJS: Inline JS-to-JS calls in the bytecode interpreter dispatch loop
Instead of recursing through 5 native stack frames per JS function
call (execute_call -> internal_call -> ordinary_call_evaluate_body ->
run_executable -> run_bytecode), handle Call and CallConstruct for
normal ECMAScript functions directly in the dispatch loop.

The fast path allocates the callee's execution context on the
InterpreterStack, copies arguments, sets up the environment, and
jumps to the callee's bytecode entry point. Return and End unwind
inline frames by restoring the caller's state. Exception unwinding
walks through inline frames to find handlers.

The fast path code is kept in NEVER_INLINE helper functions
(try_inline_call, try_inline_call_construct, pop_inline_frame) to
minimize register pressure in the dispatch loop. handle_exception
takes program_counter by value to avoid forcing it onto the stack.
Reloading of bytecode/program_counter after frame switches is done
inline at each call site via RELOAD_AND_GOTO_START to preserve a
single dispatch entry point for optimal indirect branch prediction.
2026-03-04 18:53:12 +01:00
Andreas Kling
4e0e16e510 LibJS+LibWeb: Use InterpreterStack for all execution context allocation
Replace alloca-based execution context allocation with InterpreterStack
bump allocation across all call sites: bytecode call instructions,
AbstractOperations call/construct, script evaluation, module evaluation,
and LibWeb module script evaluation.

Also replace the native stack space check with an InterpreterStack
exhaustion check, and remove the now-unused alloca macros from
ExecutionContext.h.
2026-03-04 18:53:12 +01:00
Andreas Kling
0c5e4ebc18 LibJS: Add InterpreterStack bump allocator for execution contexts
Add a managed bump allocator backed by an 8 MB mmap region to replace
alloca-based execution context allocation. Pages are committed on
demand by the OS, so only the memory actually touched is resident.

The InterpreterStack is stored as a member of VM and provides simple
LIFO allocation and deallocation of ExecutionContext frames.
2026-03-04 18:53:12 +01:00
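The LIFO bump allocation described in 0c5e4ebc18 can be sketched as follows. This is not the real InterpreterStack: `BumpStackSketch` is a hypothetical class, it uses a plain heap buffer instead of an 8 MB mmap region so the example stays portable, and the 16-byte alignment is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// LIFO bump allocator over a fixed region (heap-backed here, mmap-backed in the commit).
class BumpStackSketch {
public:
    explicit BumpStackSketch(std::size_t capacity)
        : m_base(new unsigned char[capacity])
        , m_capacity(capacity)
    {
    }

    // Returns nullptr on exhaustion; a caller could report that as a stack overflow.
    void* allocate(std::size_t size)
    {
        if (size > m_capacity)
            return nullptr;
        std::size_t aligned = (size + alignment - 1) & ~(alignment - 1);
        if (aligned > m_capacity - m_top)
            return nullptr;
        void* result = m_base.get() + m_top;
        m_top += aligned; // Bump the top past the new frame.
        return result;
    }

    // Frames must be released in reverse order of allocation.
    void deallocate(void* frame)
    {
        auto offset = static_cast<std::size_t>(static_cast<unsigned char*>(frame) - m_base.get());
        assert(offset <= m_top);
        m_top = offset; // Pop everything at and above this frame.
    }

    std::size_t used() const { return m_top; }

private:
    static constexpr std::size_t alignment = 16; // Assumed frame alignment.
    std::unique_ptr<unsigned char[]> m_base;
    std::size_t m_capacity { 0 };
    std::size_t m_top { 0 };
};
```

Allocation and deallocation are each a bounds check plus one pointer adjustment, and the exhaustion check doubles as the recursion limit once it replaces a native stack space check.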
Andreas Kling
56e09695e0 LibJS: Consolidate Put bytecode instructions and reduce code bloat
Replace 20 separate Put instructions (5 PutKinds x 4 forms) with
4 unified instructions (PutById, PutByIdWithThis, PutByValue,
PutByValueWithThis), each carrying a PutKind field at runtime instead
of being a separate opcode.

This reduces the number of handler entry points in the dispatch loop
and eliminates template instantiations of put_by_property_key and
put_by_value that were being duplicated 5x each when inlined by LTO.
2026-03-04 18:53:12 +01:00
Andreas Kling
766567a9d5 LibJS: Compare Rust and C++ bytecode for lazily compiled functions
LIBJS_COMPARE_PIPELINES previously only compared top-level
script/eval/module bytecodes. Function bodies are compiled lazily
via compile_function(), and that path had no comparison at all.

Fix this by pairing each Rust-compiled SharedFunctionInstanceData
with its C++ counterpart during top-level compilation. When a
function is later lazily compiled, compile_function() runs both
pipelines and compares the bytecodes (crashing on mismatch, same
as the top-level comparisons). The pairing is done recursively so
nested functions are also covered.
2026-03-01 21:20:54 +01:00
Shannon Booth
502ae99102 LibJS: Make more use of Value::is and Value::as_if
2026-02-28 10:24:37 -05:00
Shannon Booth
4c8723e2d8 LibJS: Implement convenience helper Value::as<T>()
For simplifying code that has an is<T> assertion followed by a
cast, analogous to AK::as<T>.
2026-02-28 10:24:37 -05:00