Commit Graph

177 Commits

Author SHA1 Message Date
InvalidUsernameException
133bbeb4ec LibJS: Copy LHS of binary expression to preserve evaluation order
This error was found by asking an LLM to generate additional, related
test cases for the bug affecting https://volkswagen.de fixed in an
earlier commit.

An unconditional call to `copy_if_needed_to_preserve_evaluation_order`
in this place was showing up quite significantly in the JS benchmarks.
To avoid the regression, there is now a small heuristic that avoids the
unnecessary Mov instruction in the vast majority of cases. This is
likely not the best way to deal with this. But the changes in the
current patch set are focussed on correctness, not performance. So I
opted for a localized, minimal-impact solution to the performance
regression.
2026-03-08 15:01:07 +01:00
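As an illustration of the evaluation-order hazard that 133bbeb4ec describes, the following is a minimal sketch in plain JavaScript. The function name is hypothetical and the snippet is not taken from the commit's tests. It relies only on the language rule that the left operand of a binary expression is evaluated before the right one.

```js
// Hypothetical illustration: the left operand is read before the right
// operand's side effect runs, so its value has to be preserved.
function lhsReadBeforeRhsSideEffect() {
    let x = 1;
    // The left `x` evaluates to 1. The assignment on the right then stores
    // 10 and itself evaluates to 10, so the sum is 11 rather than 20.
    const sum = x + (x = 10);
    return { sum, x };
}
```

An engine that reads the left operand lazily from the variable's storage would see the value written by the right operand, which is presumably why a copy into a temporary is needed in such cases.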
Andreas Kling
b2b72a1884 LibJS: Defer regex literal compilation to post-parse step
Move regex compilation out of the parsing hot path. Both the C++ and
Rust parsers now collect raw regex pattern+flags strings during parsing
and batch-compile them after parsing completes.

This is a prerequisite for moving the Rust parser to a background
thread, since LibRegex is thread-unsafe and FFI calls during parsing
prevent parallelization.

Flag validation remains in the parser since it's trivial string
checking with no LibRegex dependency.
2026-03-06 13:06:05 +01:00
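The "collect during parsing, compile afterwards" idea from b2b72a1884 can be sketched roughly as follows. This is a simplified JavaScript model, not LibJS code: every function name and the `{ pattern, flags }` record shape are assumptions, and `RegExp` stands in for the regex compiler.

```js
// Hypothetical sketch; none of these names come from LibJS.
const VALID_FLAGS = "dgimsuvy";

// Cheap string-only flag check that can stay inside the parser.
function flagsAreValid(flags) {
    const seen = new Set();
    for (const flag of flags) {
        if (!VALID_FLAGS.includes(flag) || seen.has(flag)) return false;
        seen.add(flag);
    }
    // The `u` and `v` flags are mutually exclusive.
    return !(seen.has("u") && seen.has("v"));
}

// "Parsing": record raw pattern/flags pairs instead of compiling them.
function collectRegexLiterals(literals) {
    const pending = [];
    for (const { pattern, flags } of literals) {
        if (!flagsAreValid(flags)) throw new SyntaxError(`Invalid flags: ${flags}`);
        pending.push({ pattern, flags });
    }
    return pending;
}

// Post-parse step: compile everything that was collected in one batch.
function compilePending(pending) {
    return pending.map(({ pattern, flags }) => new RegExp(pattern, flags));
}
```

Keeping the compile step separate means the collection phase touches nothing but strings, which is the property the commit message relies on for moving parsing to another thread.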
Andreas Kling
7f0e59396f LibJS: Add dump_to_string() for AST nodes and bytecode executables
Add the ability to dump AST and bytecode to a String instead of only
to stdout/stderr. This is done by adding an optional StringBuilder
output sink to ASTDumpState, and a new dump_to_string() method on
both ASTNode and Bytecode::Executable.

These will be used for comparing output between compilation pipelines.
2026-02-24 09:39:42 +01:00
Andreas Kling
7281091fdb LibJS: Make bytecode generation infallible
Remove CodeGenerationError and make all bytecode generation functions
return their results directly instead of wrapping them in
CodeGenerationErrorOr.

For the few remaining sites where codegen encounters an unimplemented
or unexpected AST node, we now use a new emit_todo() helper that emits
a NewTypeError + Throw sequence at compile time (preserving the runtime
behavior) and then switches to a dead basic block so subsequent codegen
for the same function can continue without issue.

This allows us to remove error handling from all callers of the
bytecode compiler, simplifying the code significantly.
2026-02-12 11:37:43 +01:00
Andreas Kling
2fd75d948b LibJS: Remove unused Program::global_declaration_instantiation() 2026-02-11 23:57:41 +01:00
Andreas Kling
35674df48a LibJS: Handle BigInt literal keys in class field initializer naming
When a class field has a BigInt literal key like `128n = class {}`,
the anonymous class should get the name "128". The codegen path
handles Identifier, StringLiteral, and NumericLiteral keys but was
missing BigInt keys, causing the name to be empty.

Parse the BigInt literal value at codegen time and convert it to a
decimal string for both the field_name (anonymous function naming)
and class_field_initializer_name (eval("arguments") checking) paths.
2026-02-11 23:57:41 +01:00
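The behavior that 35674df48a targets can be observed from script code. The snippet below is a small sketch; the class name `Holder` and the variable names are made up for illustration.

```js
// Per the commit message, an anonymous class assigned to a field whose
// key is the BigInt literal 128n should be named "128".
class Holder {
    128n = class {};
}

// The BigInt literal key is converted to the string property key "128".
const fieldValue = new Holder()["128"];
const inferredName = fieldValue.name;
```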
Andreas Kling
e308e73120 LibJS: Move SharedFunctionInstanceData creation out of FunctionNode
Add static factory methods create_for_function_node() on
SharedFunctionInstanceData and update all callers to use them instead
of FunctionNode::ensure_shared_data().

This removes the GC::Root<SharedFunctionInstanceData> cache from
FunctionNode, eliminating the coupling between the RefCounted AST
and GC-managed runtime objects. The cache was effectively dead code:
hoisted declarations use m_functions_to_initialize directly, and
function expressions always create fresh instances during codegen.
2026-02-11 23:57:41 +01:00
Andreas Kling
6082de6487 LibJS: Make ImportEntry and ExportEntry own their ModuleRequest
Change ImportEntry and ExportEntry to store Optional<ModuleRequest>
by value instead of raw pointers into AST storage. This decouples the
entry records from AST node lifetimes, preparing for dropping the AST
from SourceTextModule after first compilation.
2026-02-11 23:57:41 +01:00
Andreas Kling
4c7a349b62 LibJS: Remove #include <AST.h> from SharedFunctionInstanceData.h
Extract FunctionParsingInsights into its own header and introduce
FunctionLocal as a standalone mirror of Identifier::Local. This
allows SharedFunctionInstanceData.h to avoid pulling in the full
AST type hierarchy, reducing transitive include bloat.

The AST.h include is kept in SharedFunctionInstanceData.cpp where
it's needed for the constructor that accesses AST node types.
2026-02-11 23:57:41 +01:00
Andreas Kling
0e0818a232 LibJS: Remove dead AST class evaluation code
Now that class construction is driven by ClassBlueprint, remove the
old AST-based class element evaluation and class constructor creation
code:

- ClassExpression::create_class_constructor()
- ClassMethod::class_element_evaluation()
- ClassField::class_element_evaluation()
- StaticInitializer::class_element_evaluation()
- ClassElement::ClassValue typedef
- update_function_name() and class_key_to_property_name() helpers

Also remove includes that are no longer needed.
2026-02-11 23:57:41 +01:00
Andreas Kling
6decb93dd7 LibJS: Populate ClassBlueprint during codegen
Build a ClassBlueprint from ClassExpression elements at codegen time:

- Methods/getters/setters: register SharedFunctionInstanceData from
  the method's FunctionExpression
- Field initializers with literal values (numbers, booleans, null,
  strings, negated numbers): store the value directly, avoiding
  function creation entirely
- Field initializers with non-literal values: wrap in
  ClassFieldInitializerStatement and create SharedFunctionInstanceData
- Static initializers: create SharedFunctionInstanceData from the
  function body
- Constructor: register SharedFunctionInstanceData from the
  constructor's FunctionExpression

Add public accessors to ClassMethod::function() and
StaticInitializer::function_body() for codegen access.

The blueprint is registered but not yet used by NewClass (dual path).

No behavioral change.
2026-02-11 23:57:41 +01:00
Andreas Kling
1cb7a528c5 LibJS: Give rest-only parameters their argument index
When a function accesses the arguments object in non-strict mode, scope
analysis was skipping argument index assignment for all parameter
candidates. This is correct for regular parameters (which participate in
the sloppy-mode arguments-parameter linkage), but rest parameters never
participate in that linkage and should always get their argument index.
2026-02-10 02:05:20 +01:00
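The language-level difference behind 1cb7a528c5 can be shown with a short sketch. The variable names are hypothetical. `new Function` is used because its bodies are sloppy-mode unless they opt into strict mode, so the result does not depend on how the surrounding file is loaded.

```js
// Simple parameter list: `arguments` is mapped, so writing arguments[0]
// also changes the parameter `a`.
const simpleParameter = new Function(
    "a",
    "arguments[0] = 'changed'; return a;"
);

// A rest parameter makes the parameter list non-simple, so `arguments`
// is unmapped and the rest array is unaffected by the write.
const restParameter = new Function(
    "...rest",
    "arguments[0] = 'changed'; return rest[0];"
);
```

Since a rest parameter is never linked to `arguments` in this way, nothing is lost by always giving it its argument index.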
Andreas Kling
a8a1aba3ba LibJS: Replace ScopePusher with ScopeCollector
Replace the ScopePusher RAII class (which performed scope analysis
in its destructor chain during parsing) with a two-phase approach:

1. ScopeCollector builds a tree of ScopeRecord nodes during parsing
   via RAII ScopeHandle objects. It records declarations, identifier
   references, and flags, but does not resolve anything.

2. After parsing completes, ScopeCollector::analyze() walks the tree
   bottom-up and performs all resolution: propagate eval/with
   poisoning, resolve identifiers to locals/globals/arguments, hoist
   functions (Annex B.3.3), and build FunctionScopeData.

Key design decisions:
- ScopeRecord::ast_node is a RefPtr<ScopeNode> to prevent
  use-after-free when synthesize_binding_pattern re-parses an
  expression as a binding pattern (the original parse's scope records
  survive with stale AST node pointers).
- Parser::scope_collector() returns the override collector if set
  (for synthesize_binding_pattern's nested parser), ensuring all
  scope operations route to the outer parser's scope tree.
- FunctionNode::local_variables_names() delegates to its body's
  ScopeNode rather than copying at parse time, since analysis runs
  after parsing.
2026-02-10 02:05:20 +01:00
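The two-phase approach described in a8a1aba3ba can be modeled in a few lines. This is a heavily simplified, hypothetical sketch: the class names loosely mirror the commit message, but the method names and the "local"/"global" rule are assumptions, and eval/with poisoning, hoisting, and arguments handling are left out entirely.

```js
// Hypothetical sketch of collect-then-analyze; this is not LibJS code.
class ScopeRecord {
    constructor(parent = null) {
        this.parent = parent;
        this.children = [];
        this.declarations = new Set();
        this.references = []; // { name, resolution }, filled in by analyze()
        if (parent) parent.children.push(this);
    }
}

class ScopeCollector {
    constructor() {
        this.root = new ScopeRecord();
        this.current = this.root;
    }

    // Phase 1: called while parsing. Only records facts, resolves nothing.
    openScope() { this.current = new ScopeRecord(this.current); }
    closeScope() { this.current = this.current.parent; }
    declare(name) { this.current.declarations.add(name); }
    reference(name) { this.current.references.push({ name, resolution: null }); }

    // Phase 2: run once after parsing. Children are visited before parents.
    analyze(scope = this.root) {
        for (const child of scope.children) this.analyze(child);
        for (const ref of scope.references) {
            ref.resolution = this.#declaredBelowRoot(scope, ref.name) ? "local" : "global";
        }
    }

    // Walks the scope chain but stops before the root (program) scope.
    #declaredBelowRoot(scope, name) {
        for (let s = scope; s && s.parent; s = s.parent) {
            if (s.declarations.has(name)) return true;
        }
        return false;
    }
}
```

Because resolution happens only after the whole tree exists, a reference that appears before its declaration in the source is still resolved against the complete set of declarations.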
Andreas Kling
52ddc15fb3 LibJS: Redesign AST dump with unicode tree drawing
Replace the old indentation-based AST dump with a new tree-drawing
approach using unicode box characters. Each node now also shows its
source position as @line:column, and additional internal state:

- Identifier: [argument:N] vs [variable:N], declaration kind
  (var/let/const), [global], [in-eval-scope]
- FunctionNode: [strict], [arrow], [direct-eval], [uses-this],
  [uses-this-from-environment], [might-need-arguments]
- Program: (script)/(module), [strict], [top-level-await]
- YieldExpression: [yield*] for delegation

Dump code is moved from AST.cpp into a new ASTDump.cpp file.
2026-02-10 02:05:20 +01:00
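The tree-drawing technique mentioned in 52ddc15fb3 can be sketched generically as below. The node shape (`{ label, children }`), the connector strings, and the sample labels are assumptions for illustration; this is not the output format LibJS actually emits.

```js
// Hypothetical tree printer using Unicode box-drawing characters.
function dumpTree(node, prefix = "", isLast = true, isRoot = true) {
    // The root has no connector; other nodes get a branch or corner piece.
    const connector = isRoot ? "" : isLast ? "└─ " : "├─ ";
    const lines = [prefix + connector + node.label];

    // Children of a non-last node keep a vertical bar in their prefix so
    // the line visually continues down to the next sibling.
    const childPrefix = isRoot ? "" : prefix + (isLast ? "   " : "│  ");
    const children = node.children ?? [];
    children.forEach((child, index) => {
        const childIsLast = index === children.length - 1;
        lines.push(...dumpTree(child, childPrefix, childIsLast, false));
    });
    return lines;
}
```

Calling `dumpTree(root).join("\n")` yields one line per node, with each label free to carry extra annotations such as a source position.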
Andreas Kling
88d715fc68 LibJS: Eliminate HashMap operations in SFID by caching parser data
Cache necessary data during parsing to eliminate HashMap operations
in SharedFunctionInstanceData construction.

Before: 2 HashMap copies + N HashMap insertions with hash computations
After: Direct vector iteration with no hashing

Build FunctionScopeData for function scopes in the parser containing:
- functions_to_initialize: deduplicated var-scoped function decls
- vars_to_initialize: var decls with is_parameter/is_function_name
- var_names: HashTable for AnnexB extension checks
- Pre-computed counts for environment size calculation
- Flags for "arguments" handling

Add ScopeNode::ensure_function_scope_data() to compute the data
on-demand for edge cases that don't go through normal parser flow
(synthetic class constructors, static initializers, module wrappers).

Use this cached data directly in SFID with zero HashMap operations.
2026-01-25 23:08:36 +01:00
dosisod
ac8cc6d24b LibJS: Constant fold LogicalExpression
Logical expressions like `true || false` are now constant folded. This
also allows for dead code elimination if we know the right-hand side of
the expression will never be evaluated (such as `false && f()` or
`true || f()`).

In the test suites, the values are now being constant folded at compile
time. To ensure that the actual evaluation logic is being called
properly, I had to duplicate the tests and call them via a function so
the compiler would not optimize the evaluation logic away.

This also demotes `NaN` and `Infinity` identifiers to `nan` and
`inf` double literals, which will further help with const folding.
2026-01-22 08:47:18 +01:00
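The short-circuit semantics that make the folding in ac8cc6d24b safe, and the test workaround the message describes, can be illustrated as follows. The helper names are hypothetical and the snippet is not taken from the test suite.

```js
// Counts how often the right-hand side is actually evaluated.
let calls = 0;
function f() {
    calls += 1;
    return "called";
}

const andResult = false && f(); // f() is never evaluated
const orResult = true || f();   // f() is never evaluated

// Passing the operands through a function hides them from a constant
// folder, so the runtime evaluation path is still exercised.
function logicalAnd(lhs, rhs) {
    return lhs && rhs();
}
const viaFunction = logicalAnd(true, f); // f() runs exactly once here
```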
dosisod
5a8d71fb02 LibJS: Optimize double boolean not (!!) operation
This is a common way to convert a value to a boolean. Instead of doing
a boolean conversion and 2 negate operations, we replace this with a
single `ToBoolean` op code.
2026-01-22 08:45:42 +01:00
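The equivalence that 5a8d71fb02 relies on can be checked directly in script code. The sample values below are chosen for illustration only.

```js
// `!!value` and `Boolean(value)` both apply the ToBoolean conversion, so
// they agree for every input. That is what makes it valid to collapse
// the two negations into a single conversion.
const samples = [0, -0, NaN, "", "0", null, undefined, [], {}, 0n, 1n, Symbol("s")];
const allAgree = samples.every((value) => !!value === Boolean(value));
const truthiness = samples.map((value) => !!value);
```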
Luke Wilde
d766e41c94 LibJS: Store tagged template literal raw strings as StringLiterals 2026-01-06 23:25:36 +01:00
Andreas Kling
d6fbde43f8 LibJS: Track eval() scope membership per-identifier
The previous fix prevented eval() in sibling function scopes from
affecting each other, but it still had a limitation: when identifiers
from multiple scopes were merged into the same identifier group at
Program scope, the presence of eval() anywhere would taint all
identifiers in the group.

This change tracks per-identifier whether it was inside a scope with
eval() in the scope chain. When a scope closes, if it contains eval()
or has eval() in its parent chain, each identifier in that scope is
marked with `is_inside_scope_with_eval`. At Program scope finalization,
only identifiers that are NOT marked can be optimized to global lookups.

This allows code like:
```js
var x = undefined;  // Can be optimized (program scope)
(function() {
    function walk() { undefined; }  // Cannot be optimized
    eval('');
})();
```

Before: Neither `undefined` could be optimized
After: The program-scope `undefined` is optimized, while the one inside
       the function with eval() correctly uses dynamic lookup.
2026-01-06 00:11:28 +01:00
Andreas Kling
3ee80b23d3 LibJS: Store full realized SourceRange with each AST node
We were spending way too much time converting unrealized source ranges
into line/column pairs on real web content.

This improves JS parsing speed on x.com by 1.13x
2025-12-29 13:36:01 +01:00
Andreas Kling
3610a31279 LibJS: Pack members of the AST Identifier class better
A bit of creative structure packing brings this from 80 to 56 bytes.
This is hugely impactful on x.com where we have roughly 2.3 million
Identifier objects after loading the home feed.

In other words, this reduces memory usage on that page by up to 55 MiB.

We should eventually discard most of the AST after parsing, but that
will require more architectural work so this is a nice stopgap
improvement before then.
2025-12-21 15:13:47 -06:00
Andreas Kling
63eccc5640 LibJS: Don't make extra copies of every JS function's source code
Instead, let functions have a view into the AST's SourceCode object's
underlying string data. The source string is kept alive by the AST, so
it's fine to have views into it as long as the AST exists.

Reduces memory footprint on my x.com home feed by 65 MiB.
2025-12-21 10:06:04 -06:00
Andreas Kling
ece0b72e3c LibJS: Don't set [[HomeObject]] for non-method object properties
This fixes an issue where we'd incorrectly retain objects via the
[[HomeObject]] slot. This common pattern was affected:

    Object.defineProperty(o, "foo", {
        get: function() { return 123; }
    });

Above, the object literal would get assigned to the [[HomeObject]]
slot even though "get" is not a "method" per the spec.

This frees about 30,000 objects on my x.com home feed.
2025-12-17 12:50:17 -06:00
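The distinction that ece0b72e3c draws between methods and ordinary function-valued properties is visible at the language level through `super`. The sketch below uses hypothetical object and function names.

```js
const base = {
    describe() { return "base"; },
};

// Method shorthand is a method definition, so it has a [[HomeObject]]
// and may use `super`.
const derived = {
    __proto__: base,
    describe() { return "derived > " + super.describe(); },
};

// A plain function expression stored in a property is not a method
// definition; using `super` inside it is an early SyntaxError.
function superInPlainFunctionIsRejected() {
    try {
        new Function("return { get: function () { return super.x; } };");
        return false;
    } catch (error) {
        return error instanceof SyntaxError;
    }
}
```

Since a plain function expression can never reference `super`, it has no use for a [[HomeObject]], and setting one only keeps the enclosing object alive.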
Andreas Kling
9312a9f86f LibJS: Move InstantiateOrdinaryFunctionExpression into interpreter
This is execution time stuff and doesn't belong in the AST.
2025-10-27 21:14:33 +01:00
Andreas Kling
44fa9566a8 LibJS: Generate bytecode for the BlockDeclarationInstantiation AO
This necessitated adding some new instructions for creating mutable and
immutable bindings.
2025-10-27 21:14:33 +01:00
Andreas Kling
b712caf855 LibJS: Move bytecode executable cache to SharedFunctionInstanceData
This shrinks every Statement and ECMAScriptFunctionObject by one
pointer, and puts the bytecode cache in the only place that actually
makes use of it anyway: functions.
2025-10-27 21:14:33 +01:00
Andreas Kling
3a38040c82 LibJS: Make SharedFunctionInstanceData GC-allocated 2025-10-27 21:14:33 +01:00
ayeteadoe
6dbb59da77 LibJS: Export symbols causing linker errors in various consumers
After LibJS had its symbol exports optimized, the js, test-js,
test262-runner, test-wasm, and LibWeb targets began to get linker
errors following the work to add Windows support for the test-web and
ladybird targets. These extra JS_API annotations fix all of those
linker errors.
2025-08-23 16:04:36 -06:00
Timothy Flynn
cf61171864 LibJS: Port remaining bytecode identifiers to UTF-16 2025-08-14 10:27:08 +02:00
Timothy Flynn
62d85dd90a LibJS: Port RegExp flags and patterns to UTF-16 2025-08-13 09:56:13 -04:00
Timothy Flynn
b955c9b2a9 LibJS: Port the Identifier AST (and related) nodes to UTF-16
This eliminates quite a lot of UTF-8 / UTF-16 churn.
2025-08-13 09:56:13 -04:00
Timothy Flynn
0efa98a57a LibJS+LibWeb+WebContent: Port JS::PropertyKey to UTF-16
This has quite a lot of fallout. But the majority of it is just type or
UDL substitution, where the changes just fall through to other function
calls.

By changing property key storage to UTF-16, the main affected areas are:
* NativeFunction names must now be UTF-16
* Bytecode identifiers must now be UTF-16
* Module/binding names must now be UTF-16
2025-08-05 07:07:15 -04:00
ayeteadoe
2e2484257d LibJS: Enable EXPLICIT_SYMBOL_EXPORT and annotate minimum symbol set 2025-07-22 11:51:29 -04:00
ayeteadoe
539a675802 LibJS: Revert Enable EXPLICIT_SYMBOL_EXPORT
This reverts commit c14173f651. We
should only annotate the minimum number of symbols that external
consumers actually use, so I am starting from scratch to do that.
2025-07-22 11:51:29 -04:00
ayeteadoe
c14173f651 LibJS: Enable EXPLICIT_SYMBOL_EXPORT 2025-06-30 10:50:36 -06:00
Daniel Bertalan
456d750539 LibJS: Make generate_labelled_evaluation non-virtual if possible
We don't override anything with definitions of this function in
`SwitchStatement` and `LabelledStatement`. Also, we can make
`IterationStatement` abstract; there is no need to add a fallback
error-generating stub implementation of this method.
2025-05-12 11:40:45 -06:00
Aliaksandr Kalenik
db480b1f0c LibJS: Preserve information about local variables declaration kind
This is required for an upcoming change where we want to emit ThrowIfTDZ
for assignment expressions only for lexical declarations.
2025-05-06 12:06:23 +02:00
Andreas Kling
bf1b754e91 LibJS: Optimize reading known-to-be-initialized var bindings
`var` bindings are never in the temporal dead zone (TDZ), and so we
know accessing them will not throw.

We now take advantage of this by having a specialized environment
binding value getter that doesn't check for exceptional cases.

1.08x speedup on JetStream.
2025-05-04 02:31:18 +02:00
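The language rule behind bf1b754e91 can be demonstrated with a small sketch. The function names are hypothetical.

```js
// A `var` binding is initialized to undefined when its scope is entered,
// so reading it before the assignment never throws.
function readVarBeforeInitialization() {
    const seen = v;
    var v = 1;
    return seen;
}

// A `let` binding is in the temporal dead zone until its declaration is
// evaluated, so the same read throws a ReferenceError.
function readLetBeforeInitialization() {
    let result;
    try {
        result = l;
    } catch (error) {
        result = error.name;
    }
    let l = 1;
    return result;
}
```

Only the second form needs a runtime check on each read, which is the check the specialized getter for `var` bindings can skip.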
Aliaksandr Kalenik
2d732b2251 LibJS: Skip allocating locals for arguments that allowed to be local
This allows us to get rid of instructions that move arguments to locals
and to allocate a smaller JS::Value vector in ExecutionContext by reusing
slots that were already allocated for arguments.

With this change, for the following function:
```js
function f(x, y) {
    return x + y;
}
```

we now produce the following bytecode:
```
[   0]    0: Add dst:reg6, lhs:arg0, rhs:arg1
[  10]       Return value:reg6
```

instead of:
```
[   0]    0: GetArgument 0, dst:x~1
[  10]       GetArgument 1, dst:y~0
[  20]       Add dst:reg6, lhs:x~1, rhs:y~0
[  30]       Return value:reg6
```
2025-04-26 11:02:29 +02:00
Aliaksandr Kalenik
981e465a04 LibJS: Delete create_variable param in BindingPattern::generate_bytecode
It's no longer used, because we assume that the caller of this function
has already taken care of variable creation and initialization.
2025-04-22 21:57:25 +02:00
Aliaksandr Kalenik
0f14c70252 LibJS: Use Identifier to represent CatchClause parameter names
By doing that, we consistently use the Identifier node for identifiers
and also enable the mechanism that registers identifiers in the
corresponding ScopePusher for catch parameters, which is necessary for
upcoming changes.
2025-04-22 21:57:25 +02:00
Aliaksandr Kalenik
20c6c4e359 LibJS: Delete unused member in FunctionParameter AST node 2025-04-22 10:53:59 +02:00
Andreas Kling
84626c7db2 LibJS: Add a bunch of fast_is<T> helpers for commonly checked types
Based on what was hitting dynamic_cast<T> on Speedometer.
2025-04-18 14:45:56 +02:00
Andreas Kling
2a9b6f1d97 LibJS: Move computation out of the ECMAScriptFunctionObject constructor
We were doing way too much computation every time an ESFO was
instantiated. This was particularly sad, since the results of these
computations were identical every time!

This patch adds a new SharedFunctionInstanceData object that gets
shared between all instances of an ESFO instantiated from some kind of
AST FunctionNode.

~5% speedup on Speedometer 2.1 :^)
2025-04-08 18:52:35 +02:00
R-Goc
28d5d982ce Everywhere: Remove unused private fields
This commit removes the -Wno-unused-private-field flag, thus
re-enabling the warning. Unused fields were either removed or marked
[[maybe_unused]] when unsure.
2025-04-04 12:40:07 +02:00
Andreas Kling
6bb0d585e3 LibJS: Elide function wrapper for class field literal initializers
If a class field initializer is just a simple literal, we can skip
creating (and calling) a wrapper function for it entirely.

1.44x speedup on JetStream3/raytrace-private-class-fields.js
1.53x speedup on JetStream3/raytrace-public-class-fields.js
2025-04-01 23:55:20 +02:00
Andreas Kling
6c70dc5f09 LibJS: Create FunctionParameters earlier in the parser
This avoids making multiple copies of the Vector<FunctionParameter> in
the parser.
2025-03-27 19:50:13 +00:00
Andreas Kling
7477002e46 LibJS: Keep parsed function parameters in a shared data structure
Instead of making a copy of the Vector<FunctionParameter> from the AST
every time we instantiate an ECMAScriptFunctionObject, we now keep the
parameters in a ref-counted FunctionParameters object.

This reduces memory usage, and also allows us to cache the bytecode
executables for default parameter expressions without recompiling them
for every instantiation. :^)
2025-03-27 15:00:43 +00:00
Andreas Kling
46a5710238 LibJS: Use FlyString in PropertyKey instead of DeprecatedFlyString
This required dealing with *substantial* fallout.
2025-03-24 22:27:17 +00:00
Timothy Flynn
27478ec7d4 Everywhere: Run clang-format
The following command was used to clang-format these files:

    clang-format-19 -i $(find . \
        -not \( -path "./\.*" -prune \) \
        -not \( -path "./Build/*" -prune \) \
        -not \( -path "./Toolchain/*" -prune \) \
        -type f -name "*.cpp" -o -name "*.mm" -o -name "*.h")
2024-12-28 05:39:32 -08:00