46 Commits

Author SHA1 Message Date
InvalidUsernameException
133bbeb4ec LibJS: Copy LHS of binary expression to preserve evaluation order
This error was found by asking an LLM to generate additional, related
test cases for the bug affecting https://volkswagen.de fixed in an
earlier commit.

An unconditional call to `copy_if_needed_to_preserve_evaluation_order`
in this place was showing up quite significantly in the JS benchmarks.
To avoid the regression, there is now a small heuristic that avoids the
unnecessary Mov instruction in the vast majority of cases. This is
likely not the best way to deal with this. But the changes in the
current patch set are focussed on correctness, not performance. So I
opted for a localized, minimal-impact solution to the performance
regression.
2026-03-08 15:01:07 +01:00
InvalidUsernameException
4cd1fc8019 LibJS: Copy LHS of compound assignment to preserve evaluation order
This error was found by asking an LLM to generate additional, related
test cases for the bug affecting https://volkswagen.de fixed in an
earlier commit.
2026-03-08 15:01:07 +01:00
InvalidUsernameException
ced435987c LibJS: Copy keys in object expression to preserve evaluation order
This error was found by asking an LLM to generate additional, related
test cases for the bug affecting https://volkswagen.de fixed in an
earlier commit.
2026-03-08 15:01:07 +01:00
InvalidUsernameException
bb762fb43b LibJS: Do not assume arguments cannot be clobbered
`copy_if_needed_to_preserve_evaluation_order` was introduced in
c372a084a2. At that point function
arguments still needed to be copied into registers with a special
`GetArgument` instruction. Later, in
3f04d18ef7 this was changed and arguments
were made their own operand type that can be accessed directly instead.

Similar to locals, arguments can also be overwritten due to evaluation
order in various scenarios. However, the function was never updated to
account for that. Rectify that here.

With this change, https://volkswagen.de no longer gets blanked shortly
after initial load and the unhandled JS exception spam on that site is
gone too.
2026-03-08 15:01:07 +01:00
InvalidUsernameException
34b7cb6e55 LibJS: Explicitly handle all operand types when determining clobbering
The last time a new operand type was added, its effect on the function
changed in this commit was apparently not fully considered, which
introduced a bug. To avoid such errors in the future, rewrite the code
to produce a compile-time error if new operand types are added.

No functional changes yet; the actual bugfix will be in a
follow-up commit.
2026-03-08 15:01:07 +01:00
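
The compile-time guard described in 34b7cb6e55 can be illustrated with a Rust analogue. The actual change is in the C++ code generator and is not shown in this log, so this is only a sketch: the `Operand` variants and the `can_be_clobbered` name are assumptions, and the classification of registers and constants is a guess. The point is that a `match` without a wildcard arm stops compiling as soon as a new variant is added.

```rust
/// Hypothetical operand kinds; the real set lives in the C++ bytecode generator.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Operand {
    Register(u32),
    Local(u32),
    Argument(u32),
    Constant(u32),
}

/// Returns true if a later sub-expression could overwrite the operand
/// before it is used, so that it has to be copied first.
fn can_be_clobbered(operand: Operand) -> bool {
    // No `_ =>` arm on purpose: adding a variant to `Operand` makes this
    // match non-exhaustive, which is a compile-time error.
    match operand {
        // Locals and arguments can be reassigned by user code.
        Operand::Local(_) | Operand::Argument(_) => true,
        // Assumption: temporaries and constants are never reassigned.
        Operand::Register(_) | Operand::Constant(_) => false,
    }
}
```

With this shape, forgetting to classify a newly added operand kind is reported by the compiler instead of surfacing as a miscompilation at runtime.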
Andreas Kling
54a1a66112 LibJS: Store cache pointers directly in bytecode instructions
Instead of storing a u32 index into a cache vector and looking up the
cache at runtime through a chain of dependent loads (load Executable*,
load vector data pointer, multiply index, add), store the actual cache
pointer as a u64 directly in the instruction stream.

A fixup pass (Executable::fixup_cache_pointers()) runs after Executable
construction in both the Rust and C++ pipelines, walking the bytecode
and replacing each index with the corresponding pointer.

The cache pointer type is encoded in Bytecode.def (e.g.
PropertyLookupCache*, GlobalVariableCache*) so the fixup switch is
auto-generated by the Python Op code generator, making it impossible
to forget updating the fixup when adding new cached instructions.

This eliminates 3-4 dependent loads on every inline cache access in
both the C++ interpreter and the assembly interpreter.
2026-03-08 10:27:13 +01:00
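
The fixup pass from 54a1a66112 can be sketched as a toy model. Only the name `fixup_cache_pointers` and the idea of swapping a u32 index for a pointer stored as u64 come from the commit message; the `GetById` instruction layout, the `hits` counter, and `record_hit` are hypothetical stand-ins, and the real pass is auto-generated rather than handwritten.

```rust
use std::cell::Cell;

/// Hypothetical inline cache; the real caches hold lookup state, not a counter.
#[derive(Default)]
struct PropertyLookupCache {
    hits: Cell<u32>,
}

/// Hypothetical instruction with one cache operand. Before fixup the field
/// holds an index into `Executable::caches`; afterwards it holds an address.
struct GetById {
    cache: u64,
}

struct Executable {
    // A boxed slice never reallocates, so element addresses stay stable.
    caches: Box<[PropertyLookupCache]>,
    instructions: Vec<GetById>,
}

impl Executable {
    /// Replace each cache index with the address of the cache it names.
    /// Must run exactly once, after construction.
    fn fixup_cache_pointers(&mut self) {
        for insn in &mut self.instructions {
            let index = insn.cache as usize;
            let ptr: *const PropertyLookupCache = &self.caches[index];
            insn.cache = ptr as u64;
        }
    }
}

/// Runtime access after fixup: one load from the instruction, no indexing.
///
/// # Safety
/// `insn` must have been fixed up, and its owning `Executable` must be alive.
unsafe fn record_hit(insn: &GetById) {
    let cache = unsafe { &*(insn.cache as *const PropertyLookupCache) };
    cache.hits.set(cache.hits.get() + 1);
}
```

After the fixup, a handler reads the cache address straight out of the instruction instead of going through the executable and its cache vector.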
Andreas Kling
d2760d09b7 LibJS: Extract BytecodeDef into a shared Rust crate
Move the Bytecode.def parser, field type info, and layout computation
out of Rust/build.rs into a standalone BytecodeDef crate. This allows
both the Rust bytecode codegen (build.rs) and the upcoming AsmIntGen
tool to share a single source of truth for instruction field offsets
and sizes.

The AsmIntGen directory is excluded from the workspace since it has
its own Cargo.toml and is built separately by CMake.
2026-03-07 13:09:59 +01:00
Andreas Kling
c31c52b0a9 LibJS: Unify parser and scope collector error types
Replace three identical error structs (ParserError, ScopeError,
ParsedError) with a single shared ParseError type. Since all three
had the same fields (message, line, column), having separate types
only added verbose field-by-field copying at each boundary.

Now errors flow directly from parser/scope collector into
ParsedProgram without conversion.
2026-03-06 13:06:05 +01:00
Andreas Kling
7d45e897c4 LibJS: Add rust_compile_parsed_module() for pre-parsed modules
Add rust_compile_parsed_module() which takes a ParsedProgram (from
rust_parse_program with type=Module) and compiles it with GC
interaction. This extracts import/export metadata, compiles the
module body to bytecode, and extracts declaration data.

Rewrite rust_compile_module() to delegate to rust_parse_program()
followed by rust_compile_parsed_module() internally, matching the
rust_compile_script() pattern.
2026-03-06 13:06:05 +01:00
Andreas Kling
2082e063aa LibJS: Split rust_compile_script() into parse and compile steps
Add a ParsedProgram struct that holds the parsed AST, function table,
scope data, and strictness flag without any GC references. This
enables future off-thread parsing since the parse step makes zero
GC allocations.

The type is called ParsedProgram (not ParsedScript) because it will
be used for both scripts and modules. It takes a program_type
parameter (0 = Script, 1 = Module) to handle both cases.

New FFI functions:
- rust_parse_program(): lex, parse, scope analysis (no VM/GC needed)
- rust_compile_parsed_script(): codegen + GDI extraction (needs VM)
- rust_parsed_program_has_errors(): check for parse errors
- rust_parsed_program_take_errors(): report errors via callback
- rust_parsed_program_ast_dump(): lazily generate AST dump string
- rust_free_parsed_program(): free without compiling

Rewrite rust_compile_script() to call rust_parse_program() followed
by rust_compile_parsed_script() internally, preserving the existing
behavior and API.
2026-03-06 13:06:05 +01:00
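
The parse/compile split from 2082e063aa, together with the shared error type from c31c52b0a9, might look roughly like the following. This is a simplified sketch and not the FFI surface itself: the function names drop the `rust_` prefix, `Vm` is a placeholder for the GC-owning side, and the "parser" only counts tokens and checks parentheses. What it tries to show is that the parsed result holds plain data, so it can be produced on another thread and handed over for compilation.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum ProgramType {
    Script,
    Module,
}

/// Single error type shared by every stage (message, line, column).
#[derive(Debug, PartialEq)]
struct ParseError {
    message: String,
    line: u32,
    column: u32,
}

/// Output of the parse step. Plain data only, so it is `Send`.
struct ParsedProgram {
    program_type: ProgramType,
    token_count: usize, // stand-in for the AST, function table and scope data
    errors: Vec<ParseError>,
}

/// Parse step: needs no VM. The balance check is a toy replacement for
/// real lexing, parsing and scope analysis.
fn parse_program(source: &str, program_type: ProgramType) -> ParsedProgram {
    let mut errors = Vec::new();
    if source.matches('(').count() != source.matches(')').count() {
        errors.push(ParseError {
            message: "unbalanced parentheses".to_string(),
            line: 1,
            column: 1,
        });
    }
    ParsedProgram {
        program_type,
        token_count: source.split_whitespace().count(),
        errors,
    }
}

/// Placeholder for the side that owns GC state.
#[derive(Default)]
struct Vm {
    compiled_programs: usize,
}

/// Compile step: consumes the parsed program and needs the VM.
fn compile_parsed_script(vm: &mut Vm, parsed: ParsedProgram) -> Result<usize, Vec<ParseError>> {
    if !parsed.errors.is_empty() {
        return Err(parsed.errors);
    }
    vm.compiled_programs += 1;
    Ok(parsed.token_count)
}

/// The combined entry point keeps its behavior by chaining both steps.
fn compile_script(vm: &mut Vm, source: &str) -> Result<usize, Vec<ParseError>> {
    compile_parsed_script(vm, parse_program(source, ProgramType::Script))
}
```

In this model, `std::thread::spawn(|| parse_program(...))` followed by `compile_parsed_script` on the VM thread is enough to move parsing off-thread, because errors travel inside `ParsedProgram` without any conversion.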
Andreas Kling
b2b72a1884 LibJS: Defer regex literal compilation to post-parse step
Move regex compilation out of the parsing hot path. Both the C++ and
Rust parsers now collect raw regex pattern+flags strings during parsing
and batch-compile them after parsing completes.

This is a prerequisite for moving the Rust parser to a background
thread, since LibRegex is thread-unsafe and FFI calls during parsing
prevent parallelization.

Flag validation remains in the parser since it's trivial string
checking with no LibRegex dependency.
2026-03-06 13:06:05 +01:00
Andreas Kling
ac35ef465b LibJS: Emit ThrowIfTDZ before simple assignment to let variables
The Rust bytecode codegen was missing a TDZ check before assigning to
local let/const variables in simple assignment expressions (a = expr).
The C++ pipeline correctly emits ThrowIfTDZ before the store to ensure
temporal dead zone semantics are enforced.

Add an emit_tdz_check_if_needed helper matching the C++ equivalent,
and call it in the simple assignment path.
2026-03-04 18:53:12 +01:00
Andreas Kling
56e09695e0 LibJS: Consolidate Put bytecode instructions and reduce code bloat
Replace 20 separate Put instructions (5 PutKinds x 4 forms) with
4 unified instructions (PutById, PutByIdWithThis, PutByValue,
PutByValueWithThis), each carrying a PutKind field at runtime instead
of being a separate opcode.

This reduces the number of handler entry points in the dispatch loop
and eliminates template instantiations of put_by_property_key and
put_by_value that were being duplicated 5x each when inlined by LTO.
2026-03-04 18:53:12 +01:00
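
The consolidation in 56e09695e0 trades opcodes for a runtime field. A rough model is sketched below; the four instruction names come from the commit message, while the `PutKind` variant names, the `execute` function, and the string results are assumptions made for illustration.

```rust
/// Hypothetical names for the five put kinds.
#[derive(Clone, Copy, Debug, PartialEq)]
enum PutKind {
    Normal,
    Getter,
    Setter,
    Prototype,
    Own,
}

/// Four opcodes instead of twenty: the kind is data, not part of the opcode.
#[derive(Debug)]
enum Instruction {
    PutById { kind: PutKind },
    PutByIdWithThis { kind: PutKind },
    PutByValue { kind: PutKind },
    PutByValueWithThis { kind: PutKind },
}

/// One shared helper branches on the kind at runtime, instead of one
/// specialized copy of the helper per kind.
fn put(form: &str, kind: PutKind) -> String {
    let action = match kind {
        PutKind::Normal => "set",
        PutKind::Getter => "define getter",
        PutKind::Setter => "define setter",
        PutKind::Prototype => "set prototype",
        PutKind::Own => "define own",
    };
    format!("{action} ({form})")
}

/// Dispatch has one entry point per form.
fn execute(insn: &Instruction) -> String {
    match insn {
        Instruction::PutById { kind } => put("by id", *kind),
        Instruction::PutByIdWithThis { kind } => put("by id with this", *kind),
        Instruction::PutByValue { kind } => put("by value", *kind),
        Instruction::PutByValueWithThis { kind } => put("by value with this", *kind),
    }
}
```

The cost is one extra branch on `kind` per put; the benefit described in the commit is fewer handler entry points and fewer duplicated helper bodies.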
Andreas Kling
fb61294df7 LibJS: Add UsingDeclaration to needs_block_declaration_instantiation
Blocks containing non-local using declarations need a lexical
environment, just like let/const declarations. Add the missing
UsingDeclaration case to match C++ behavior.
2026-03-04 12:17:59 +01:00
Andreas Kling
bd7fc2b1b1 LibJS: Fix ResolveThisBinding/ResolveSuperBase emission order
Emit ResolveThisBinding before ResolveSuperBase in both
emit_evaluate_member_reference and emit_store_to_reference, matching
the C++ pipeline's evaluation order for super property references.

Also restructure emit_evaluate_member_reference to move non-super base
evaluation into the else branch, since the super path now handles
base evaluation differently (explicit ResolveSuperBase instead of
going through generate_expression on Super).
2026-03-04 12:17:59 +01:00
Andreas Kling
4120765497 LibJS: Keep arg_holders alive in generate_arguments_array
Keep the arg_holders vector alive through the spread arguments loop,
matching the C++ pipeline where the args Vector keeps registers held
through the loop. This ensures consistent register allocation.
2026-03-04 12:17:59 +01:00
Andreas Kling
7ceba6d2cb LibJS: Fix register order in private logical assignment
Move the destination register allocation after RHS evaluation in
private identifier logical assignment (&&=, ||=, ??=), matching the
C++ pipeline's register allocation order.
2026-03-04 12:17:59 +01:00
Andreas Kling
fa72fd9f95 LibJS: Optimize constant string computed properties to MemberId
When a computed member expression uses a constant string (e.g.
super["minutes"] or obj["key"]), optimize it to use the MemberId or
SuperMemberId reference form instead of the value-based form, matching
the C++ pipeline optimization.
2026-03-04 12:17:59 +01:00
Andreas Kling
33a6b90ccf LibJS: Clear pending_lhs_name for named class expressions
Named class expressions don't use pending_lhs_name, but we must still
clear it to prevent it from leaking through to nested anonymous
functions inside the class body.
2026-03-04 12:17:59 +01:00
Andreas Kling
ce4767f744 LibJS: Only set pending_lhs_name for non-empty class field names
For computed class fields, field_name is empty and the name is set at
runtime. Avoid setting pending_lhs_name in that case, which prevents
the name from leaking into computed field initializers.
2026-03-04 12:17:59 +01:00
Andreas Kling
722a897b28 LibJS: Remove redundant ThrowIfTDZ from Rust emit_set_variable
The caller is responsible for emitting ThrowIfTDZ before calling
emit_set_variable(), matching the C++ pipeline behavior. Remove the
redundant TDZ checks from both the const and non-const local paths.
2026-03-04 12:17:59 +01:00
slim
9ee2bb5570 LibJS: Refactor function token is identifier to make it more readable 2026-03-02 12:04:02 +01:00
Andreas Kling
fdcb2cdd94 LibJS/Rust: Create for-of loop blocks before evaluating iterable
Match the C++ pipeline's block creation order by creating end_block
and update_block before evaluating the for-of RHS expression. This
ensures loop structure blocks get lower block numbers than blocks
created during RHS evaluation (e.g. from conditional expressions),
producing the same block layout as C++.
2026-03-01 21:20:54 +01:00
Andreas Kling
496ca319e4 LibJS/Rust: Skip TDZ check for self-move in emit_set_variable
Move the self-move detection before the TDZ check in
emit_set_variable, matching C++ which returns early for self-moves
without emitting anything. A self-move only happens in compound and
logical assignments where the LHS is a local variable, and the LHS
was already read with a TDZ check, making the write-back's TDZ check
redundant.
2026-03-01 21:20:54 +01:00
Andreas Kling
c379587adf LibJS/Rust: Don't pass preferred_dst for destructuring assignment RHS
When the LHS of an assignment is a destructuring pattern, don't pass
preferred_dst when generating the RHS expression. This matches the C++
pipeline which always allocates a fresh register for the RHS value.

The difference was visible when a destructuring assignment appeared
inside a logical AND expression: the C++ pipeline would allocate the
RHS into a fresh register and then copy it to the AND result register,
while the Rust pipeline would evaluate the RHS directly into the AND
result register, omitting the copy.
2026-03-01 21:20:54 +01:00
Andreas Kling
7602e030e7 LibJS/Rust: Push LeaveLexicalEnvironment boundary in for-of loops
When a for-of loop creates a per-iteration lexical environment for
let/const declarations, push a LeaveLexicalEnvironment boundary so
that continue/break/return properly restores the lexical environment.
The for-in codegen already did this but for-of was missing it.
2026-03-01 21:20:54 +01:00
Andreas Kling
75e628ce9b LibJS/Rust: Push LeaveLexicalEnvironment boundary in switch statements
When a switch statement creates a lexical environment for block-scoped
declarations (let/const), push a LeaveLexicalEnvironment boundary so
that perform_needed_unwinds correctly emits SetLexicalEnvironment
before Return/Throw instructions inside the switch body.
2026-03-01 21:20:54 +01:00
Andreas Kling
93d19b35b6 LibJS/Rust: Handle private member update expressions
Add PrivateIdentifier handling in generate_update_expression so that
postfix/prefix increment/decrement on private members (e.g. this.#c++)
correctly emits GetPrivateById, PostfixIncrement, and PutPrivateById.
Previously this fell through to an empty fallback that returned an
uninitialized register.
2026-03-01 21:20:54 +01:00
Andreas Kling
fa78df7ab7 LibJS/Rust: Call perform_needed_unwinds before Throw instructions
Add missing perform_needed_unwinds() calls before Throw instructions
in four places:
- Await continuation throw path
- Yield* throw_value_block
- Yield* iterator missing throw method
- Invalid left-hand side in assignment helper

This matches the C++ pipeline which calls perform_needed_unwinds<Throw>
before every Throw to restore lexical environments when throwing out of
scopes like with statements.
2026-03-01 21:20:54 +01:00
Andreas Kling
4f503ef243 LibJS/Rust: Use correct capacity for CreateVariableEnvironment
Pass the fully computed var_environment_bindings_count from the SFD
metadata to the codegen, instead of using the raw
non_local_var_count_for_parameter_expressions. The full count includes
additional bindings from Annex B function hoisting and strict-mode
lexical declarations that share the var environment.
2026-03-01 21:20:54 +01:00
Andreas Kling
e88932f75d LibJS/Rust: Always allocate fresh register for postfix update result
Match the C++ pipeline behavior where PostfixIncrement/PostfixDecrement
always writes to a freshly allocated register. The Rust pipeline was
using the caller's preferred_dst, producing one fewer Mov instruction
but causing bytecode mismatches.
2026-03-01 21:20:54 +01:00
Andreas Kling
1c40257c7f LibJS/Rust: Restore unwind handler when trampolining through finally
When break/continue trampolines through nested finally blocks, we need
to restore the unwind handler to the level that was active before each
finally context was pushed. Without this, trampoline blocks created for
inner finally dispatch incorrectly inherited the innermost exception
handler, causing exception handler range mismatches with C++.

Store the current_unwind_handler in each FinallyContext at push time,
and restore it in emit_trampoline_through_finally when popping through.
2026-03-01 21:20:54 +01:00
Andreas Kling
8f38e526e7 LibJS/Rust: Fix several bytecode mismatches with the C++ pipeline
- Don't emit dead code after Throw for UsingDeclaration in for-of
  LHS assignment. Guard loop body generation with
  is_current_block_terminated() in both for-in and for-of.

- Add LeaveLexicalEnvironment boundary tracking to for-loop
  per-iteration environment management, so that perform_needed_unwinds
  correctly emits SetLexicalEnvironment before Throw instructions
  inside the loop body.
2026-03-01 21:20:54 +01:00
Andreas Kling
d0b9905de1 LibJS/Rust: Use GetLengthWithThis for super.length property access
The C++ pipeline has an optimization that uses the GetLengthWithThis
instruction instead of GetByIdWithThis when accessing the "length"
property. Add the same optimization to the Rust pipeline by
introducing an emit_get_by_id_with_this helper that checks for the
"length" property name and emits the optimized instruction.

Also update emit_get_by_value_with_this to use GetLengthWithThis
when the computed property is a constant "length" string.
2026-03-01 21:20:54 +01:00
Andreas Kling
c365806041 LibJS/Rust: Fix evaluation order for super in tagged templates
Per spec, computed property key expressions should be evaluated
before calling ResolveSuperBase. Fix the Rust codegen for tagged
template literals with super member expressions to match the C++
pipeline's correct evaluation order.
2026-03-01 21:20:54 +01:00
Andreas Kling
56603319b4 LibJS/Rust: Fix evaluation order in delete super[key]
Per spec, the property key expression should be evaluated before
calling ResolveSuperBase. Fix the Rust codegen to match the C++
pipeline's correct evaluation order.
2026-03-01 21:20:54 +01:00
Andreas Kling
543bd7059a LibJS/Rust: Don't emit dead code after Throw for invalid LHS
Match the C++ pipeline behavior by not creating a dead basic block
after emitting a Throw instruction in emit_invalid_lhs_error.
2026-03-01 21:20:54 +01:00
Andreas Kling
18c40a1328 LibJS/Rust: Fix has_parameter_expressions and TDZ checks for arguments
Fix two bugs in the Rust bytecode codegen:

1. has_parameter_expressions incorrectly treated any destructuring
   parameter as a "parameter expression", when it should only do so
   for patterns that contain expressions (defaults or computed keys).
   This caused an unnecessary CreateLexicalEnvironment for simple
   destructuring like `function f({a, b}) {}`. The same bug existed
   in both codegen.rs and lib.rs (SFD metadata computation).

2. emit_set_variable used is_local_lexically_declared(index) for
   argument locals, but that function indexes into the local_variables
   array using the argument's index, checking the wrong variable.
   This caused spurious ThrowIfTDZ instructions when assigning to
   function arguments that happened to share an index with an
   uninitialized let/const variable.
2026-03-01 21:20:54 +01:00
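
The first fix in 18c40a1328 amounts to looking inside destructuring patterns rather than treating every pattern as an expression. A minimal sketch over a made-up pattern AST follows; `Pattern`, `Property`, and `Parameter` are hypothetical types, and array patterns and rest elements are left out for brevity.

```rust
/// Hypothetical, heavily reduced binding pattern AST.
enum Pattern {
    Identifier(String),
    Object(Vec<Property>),
}

struct Property {
    computed_key: bool, // `{ [expr]: a }`
    has_default: bool,  // `{ a = expr }`
    value: Pattern,
}

struct Parameter {
    pattern: Pattern,
    has_default: bool, // `function f(a = expr) {}`
}

/// A pattern counts only if it contains an expression somewhere inside it.
fn pattern_contains_expression(pattern: &Pattern) -> bool {
    match pattern {
        Pattern::Identifier(_) => false,
        Pattern::Object(properties) => properties.iter().any(|property| {
            property.computed_key
                || property.has_default
                || pattern_contains_expression(&property.value)
        }),
    }
}

/// Plain destructuring such as `function f({a, b}) {}` yields false here.
fn has_parameter_expressions(parameters: &[Parameter]) -> bool {
    parameters
        .iter()
        .any(|parameter| parameter.has_default || pattern_contains_expression(&parameter.pattern))
}
```

Under this rule, `function f({a, b}) {}` needs no extra environment, while `function f({a = 1}) {}` or `function f({[k]: a}) {}` still does.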
Andreas Kling
00ffc340bc LibJS: Wrap CompiledRegex in Rc to allow AST cloning
CompiledRegex held an FFI handle with unique ownership and panicked
on clone. This caused a crash when a class field initializer contained
a regex literal, since the codegen wraps field initializers in a
synthetic function body by cloning the expression.

Wrapping CompiledRegex in Rc makes the clone a cheap refcount bump.
The take() semantics are preserved: the first codegen path to call
take() gets the handle, and Drop frees it if nobody took it.
2026-02-25 21:54:30 +01:00
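
The ownership scheme described in 00ffc340bc can be approximated as below. This is a sketch under assumptions: `RegexHandle` is a plain integer standing in for the FFI handle, `RegexSlot` is an invented name, and the `freed` counter replaces the real FFI release call so the behavior can be observed.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Stand-in for the FFI handle.
type RegexHandle = u32;

/// Shared slot holding the handle until somebody takes it.
struct RegexSlot {
    handle: Cell<Option<RegexHandle>>,
    freed: Rc<Cell<usize>>, // hypothetical: counts simulated frees
}

impl Drop for RegexSlot {
    fn drop(&mut self) {
        // Runs once, when the last clone goes away. Free only if the
        // handle was never taken.
        if self.handle.take().is_some() {
            self.freed.set(self.freed.get() + 1);
        }
    }
}

/// Cloning only bumps the reference count; the handle is never duplicated.
#[derive(Clone)]
struct CompiledRegex(Rc<RegexSlot>);

impl CompiledRegex {
    fn new(handle: RegexHandle, freed: Rc<Cell<usize>>) -> Self {
        CompiledRegex(Rc::new(RegexSlot {
            handle: Cell::new(Some(handle)),
            freed,
        }))
    }

    /// The first caller gets the handle; later callers get `None`.
    fn take(&self) -> Option<RegexHandle> {
        self.0.handle.take()
    }
}
```

Cloning an AST node that contains such a value is therefore safe: whichever codegen path calls `take()` first becomes the owner, and an untaken handle is released exactly once.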
Andreas Kling
f19d00ca9e LibJS: Memoize failed arrow function attempts in Rust parser
Cache failed arrow function attempts by token offset. Once we
determine that '(' at offset N is not the start of an arrow
function, skip re-attempting at the same offset.

Without memoization, nested expressions like (a=(b=(c=(d=0))))
cause exponential work: each failed arrow attempt at an outer '('
re-parses all inner '(' positions during grouping expression
re-parse, and each inner position triggers its own arrow
attempts. With n nesting levels, the innermost position is
processed O(2^n) times.

The C++ parser already has this optimization (via the
try_parse_arrow_function_expression_failed_at_position()
memoization cache).
2026-02-24 18:42:13 +01:00
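
The memoization in f19d00ca9e boils down to a set of offsets that are known not to start an arrow function. The sketch below models only that cache: the `Parser` struct, its field names, and the `speculate` closure standing in for the real speculative parse are all hypothetical.

```rust
use std::collections::HashSet;

/// Hypothetical parser state; only the memoization cache is modeled.
struct Parser {
    /// Token offsets at which an arrow function attempt already failed.
    failed_arrow_offsets: HashSet<usize>,
    /// Number of speculative parses that actually ran (for illustration).
    attempts: usize,
}

impl Parser {
    fn new() -> Self {
        Parser {
            failed_arrow_offsets: HashSet::new(),
            attempts: 0,
        }
    }

    /// Try to parse an arrow function starting at `offset`. `speculate`
    /// returns the end offset on success and `None` on failure.
    fn try_parse_arrow_function(
        &mut self,
        offset: usize,
        speculate: impl FnOnce() -> Option<usize>,
    ) -> Option<usize> {
        // A known failure at this offset is answered without re-parsing.
        if self.failed_arrow_offsets.contains(&offset) {
            return None;
        }
        self.attempts += 1;
        let result = speculate();
        if result.is_none() {
            self.failed_arrow_offsets.insert(offset);
        }
        result
    }
}
```

Because each `(` offset can fail at most once, the repeated re-parsing of inner positions in inputs like `(a=(b=(c=(d=0))))` is cut off after the first failed attempt per offset.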
xnacly
48e906edfd Meta: Add 'cargo clippy -- -D clippy::all' to lint-ci.sh 2026-02-24 16:35:51 +01:00
xnacly
bbb6121df4 LibJS/Rust: Migrate to edition 2024 2026-02-24 16:35:51 +01:00
xnacly
e897b77e83 LibJS/Rust: Cargo fmt on all source files 2026-02-24 16:35:51 +01:00
xnacly
809f81704c LibJS/Rust: Clean build script up
Small changes but many of them:

- all codegen now directly writes into the target file instead of
  creating intermediate Strings via the Write trait
- all unwraps are now a combination of Results and ?
- field_type_info now returns a structure instead of a tuple.
- rebuilding now no longer appends the same code again, but truncates
  before codegen
2026-02-24 16:35:51 +01:00
Andreas Kling
81cb230526 Flatpak: Add Rust toolchain and vendored cargo dependencies
Add the rust-stable SDK extension, pre-download Corrosion and all crate
dependencies, and set up cargo vendoring so the Flatpak build can
compile Rust code without network access during the build phase.
2026-02-24 09:39:42 +01:00
Andreas Kling
6cdfbd01a6 LibJS: Add alternative source-to-bytecode pipeline in Rust
Implement a complete Rust reimplementation of the LibJS frontend:
lexer, parser, AST, scope collector, and bytecode code generator.

The Rust pipeline is built via Corrosion (CMake-Cargo bridge) and
linked into LibJS as a static library. It is gated behind a build
flag (ENABLE_RUST, on by default except on Windows) and two runtime
environment variables:

- LIBJS_CPP: Use the C++ pipeline instead of Rust
- LIBJS_COMPARE_PIPELINES=1: Run both pipelines in lockstep,
  aborting on any difference in AST or bytecode generated.

The C++ side communicates with Rust through a C FFI layer
(RustIntegration.cpp/h) that passes source text to Rust and receives
a populated Executable back via a BytecodeFactory interface.
2026-02-24 09:39:42 +01:00