If we have a UTF-16 string with ASCII storage, we don't need to convert
the string to UTF-8 to get a StringView. This fast-path is hit over 76k
times during test-js and over 31 million times during test262.
CompiledRegex held an FFI handle with unique ownership and panicked
on clone. This caused a crash when a class field initializer contained
a regex literal, since the codegen wraps field initializers in a
synthetic function body by cloning the expression.
Wrapping CompiledRegex in Rc makes the clone a cheap refcount bump.
The take() semantics are preserved: the first codegen path to call
take() gets the handle, and Drop frees it if nobody took it.
The Rust FFI requires UTF-16 source data, so ASCII-stored source code
must be widened to UTF-16. Previously, this conversion was done into a
temporary buffer on every call to compile_function, meaning the entire
source file was converted for each lazily-compiled function. For large
modules with many functions, this caused significant redundant work.
Move the conversion into SourceCode::utf16_data() which lazily converts
and caches the result once per source file. Subsequent compilations of
functions from the same file reuse the cached data.
The Rust bytecode pipeline stores SharedFunctionInstanceData (SFD)
pointers as raw void pointers invisible to the GC. If garbage collection
runs during compilation (triggered by heap allocation of a new SFD), it
can collect previously created SFDs, leaving stale pointers that crash
during the next GC marking phase.
Every other Rust compilation entry point (compile_script, compile_eval,
compile_shadow_realm_eval, compile_dynamic_function, compile_function)
already uses GC::DeferGC to prevent this. Add the missing DeferGC to
compile_module and compile_builtin_file.
Cache failed arrow function attempts by token offset. Once we
determine that '(' at offset N is not the start of an arrow
function, skip re-attempting at the same offset.
Without memoization, nested expressions like (a=(b=(c=(d=0))))
cause exponential work: each failed arrow attempt at an outer '('
re-parses all inner '(' positions during grouping expression
re-parse, and each inner position triggers its own arrow
attempts. With n nesting levels, the innermost position is
processed O(2^n) times.
The C++ parser already has this optimization (via the
try_parse_arrow_function_expression_failed_at_position()
memoization cache).
Small changes but many of them:
- all codegen now directly writes into the target file instead of
creating intermediate Strings via the Write trait
- all unwraps are replaced with a combination of Results and the ?
operator
- field_type_info now returns a structure instead of a tuple
- rebuilding no longer appends the same code again, but truncates
before codegen
Add the rust-stable SDK extension, pre-download Corrosion and all crate
dependencies, and set up cargo vendoring so the Flatpak build can
compile Rust code without network access during the build phase.
Implement a complete Rust reimplementation of the LibJS frontend:
lexer, parser, AST, scope collector, and bytecode code generator.
The Rust pipeline is built via Corrosion (CMake-Cargo bridge) and
linked into LibJS as a static library. It is gated behind a build
flag (ENABLE_RUST, on by default except on Windows) and two runtime
environment variables:
- LIBJS_CPP: Use the C++ pipeline instead of Rust
- LIBJS_COMPARE_PIPELINES=1: Run both pipelines in lockstep,
aborting on any difference in AST or bytecode generated.
The C++ side communicates with Rust through a C FFI layer
(RustIntegration.cpp/h) that passes source text to Rust and receives
a populated Executable back via a BytecodeFactory interface.
Functions created via new Function() cannot assume that unresolved
identifiers refer to global variables, since they may be called in
an arbitrary scope. Pass a flag through the scope collector analysis
to suppress the global identifier optimization in this case.
When the parser speculatively tries to parse an arrow function
expression, encountering `this` inside a default parameter value
like `(a = this)` propagates uses_this flags to ancestor function
scopes via set_uses_this(). If the arrow attempt fails (no =>
follows), these flags were left behind, incorrectly marking
ancestor scopes as using `this`.
Fix this by saving and restoring the uses_this and
uses_this_from_environment flags on all ancestor function scopes
around speculative arrow function parsing.
The parser was not calling add_declaration() for using declarations in
for-loop initializers, causing the scope collector to miss them. This
meant identifiers declared with `using` in for-loops were incorrectly
resolved as globals instead of local variables.
When parsing object expressions like {x: 42}, the parser was eagerly
registering the identifier "x" in the scope collector even though it's
only used as a property key and not a variable reference.
Fix this by deferring the registration until we know we're dealing with
a shorthand property ({x}), where the identifier is actually a variable
reference that needs to be captured by the scope collector.
The short-circuit evaluation in `is_invalid || !consume_exponent()`
skipped the consume_exponent() call when is_invalid was already true.
This meant that for inputs like `1._1e2`, the exponent part would not
be consumed and would instead be lexed as separate tokens.
Add the ability to dump AST and bytecode to a String instead of only
to stdout/stderr. This is done by adding an optional StringBuilder
output sink to ASTDumpState, and a new dump_to_string() method on
both ASTNode and Bytecode::Executable.
These will be used for comparing output between compilation pipelines.
The scope collector uses HashMaps for identifier groups and variables,
which means their iteration order is non-deterministic. This causes
local variable indices and function declaration instantiation (FDI)
bytecode to vary between runs.
Fix this by sorting identifier group keys alphabetically before
assigning local variable indices, and sorting vars_to_initialize by
name before emitting FDI bytecode.
Also make register allocation deterministic by always picking the
lowest-numbered free register instead of whichever one happens to be
at the end of the free list.
This is preparation for bringing in a new source->bytecode pipeline
written in Rust. Checking for regressions is significantly easier
if we can expect identical output from both pipelines.
There's now an ENABLE_RUST CMake option (on by default).
Install Rust via rustup in devcontainer scripts, document the
requirement in build instructions, and add Cargo's target/ directory
to .gitignore.
In the benchmark added here, fmt's dragonbox is ~3x faster than our own
Ryu implementation (1197ms for dragonbox vs. 3435ms for Ryu).
Daniel Lemire recently published an article about these algorithms:
https://lemire.me/blog/2026/02/01/converting-floats-to-strings-quickly/
In this article, fmt's dragonbox implementation is actually one of the
slower ones (with the caveat that some comments note that the article is
a bit out-of-date). I've gone with fmt here because:
1. It has a readily available recent version on vcpkg.
2. It provides the methods we need to actually convert a floating point
to decimal exponential form.
3. There is an ongoing effort to replace dragonbox with a new algorithm,
zmij, which promises to be faster.
4. It is one of the only users of AK/UFixedBigInt, so we can potentially
remove that as well soon.
5. Bringing in fmt opens the door to replacing a bunch of AK::format
facilities with fmt as well.
When the computed significand lands exactly on 10 ^ (precision - 1), the
value sits right on a power-of-10 boundary where two representations are
possible. For example, consider 1e-21. The nearest double value to 1e-21
is actually slightly less than 1e-21. So with `toPrecision(16)`, we must
choose between:
exponent=-21 significand=1000000000000000 -> 1.000000000000000e-21
exponent=-22 significand=9999999999999999 -> 9.999999999999999e-22
The spec dictates that we must pick the value that is closer to the true
value. In this case, the second value is actually closer.
The arithmetic here closely matches that of toPrecision from commit
cf180bd4da. The only difference is that test262 contains tests for
which our exponent estimate is off by one. We now handle this by
detecting the inaccuracy, adjusting the exponent, and recomputing the
significand.
This will be needed by Number.prototype.toExponential. It will also need
some changes, so moving it up ahead of time will make that diff more
practical to read.
For `export default (class Name { })`, two things were wrong:
The parser extracted the class expression's name as the export's
local binding name instead of `*default*`. Per the spec, this is
`export default AssignmentExpression ;` whose BoundNames is
`*default*`, not the class name.
The bytecode generator had a special case for ClassExpression that
skipped emitting InitializeLexicalBinding for named classes.
These two bugs compensated for each other (no crash, but wrong
behavior). Fix both: always use `*default*` as the local binding
name for expression exports, and always emit InitializeLexicalBinding
for the `*default*` binding.
Rework our hash functions a bit for significantly better performance:
* Rename int_hash to u32_hash to mirror u64_hash.
* Make pair_int_hash call u64_hash instead of multiple u32_hash()es.
* Implement MurmurHash3's fmix32 and fmix64 for u32_hash and u64_hash.
On my machine, this speeds up u32_hash by 20%, u64_hash by ~290%, and
pair_int_hash by ~260%.
We lose the property that an input of 0 results in something that is not
0. I've experimented with an offset to both hash functions, but it
resulted in a measurable performance degradation for u64_hash. If
there's a good use case for 0 not to result in 0, we can always add in
that offset as a countermeasure in the future.
Our previous implementation produced incorrect results for values near
the limits of double precision. This new implementation avoids
floating-point arithmetic entirely by:
1. Decomposing the double value into its exact binary form.
2. Computing the formulas from the spec using bigints.
3. Using Ryu to calculate the decimal exponent.
When a statement in a switch case body doesn't produce a result (e.g.
a variable declaration), we were incorrectly resetting the completion
value to undefined. This caused the completion value of preceding
expression statements to be lost.
Per step 13 of ScriptEvaluation in the ECMA-262 spec, the script body
should only be evaluated if GlobalDeclarationInstantiation returned a
normal completion.
This can't currently be triggered since we always create fresh Script
objects, but if we ever start reusing cached executables across
evaluations, this would prevent a subtle bug where the script body
runs despite GDI failing.
Both return paths from parse_import_statement() were missing a call to
consume_or_insert_semicolon(), causing explicit semicolons to be left
unconsumed and parsed as spurious EmptyStatements.
The result of parsing an identifier cannot change, and parsing one is
not cheap, so let's cache the result.
This is hammered in a few test262 tests. On my machine, this reduces the
runtime of each test/staging/sm/Date/dst-offset-caching-{N}-of-8.js by
0.3 to 0.5 seconds. For example:
dst-offset-caching-1-of-8.js: Reduces from 1.2s -> 0.9s
dst-offset-caching-3-of-8.js: Reduces from 1.5s -> 1.1s
These are String from the outset, so this patch is almost entirely just
changing function parameter types. This will allow us to cache time zone
parse results without invoking any extra allocations.
When a nested function (arrow or function expression) inside a default
parameter expression captures a name that also has a body var
declaration, the capture must propagate to the parent scope. Otherwise,
the outer scope optimizes the binding to a local register, making it
invisible to GetBinding at runtime.
When a function has parameter expressions (default values), body var
declarations that shadow a name referenced in a default parameter
expression must not be optimized to local variables. The default
expression needs to resolve the name from the outer scope via the
environment chain, not read the uninitialized local.
We now mark identifiers referenced during formal parameter parsing
with an IsReferencedInFormalParameters flag, and skip local variable
optimization for body vars that carry both this flag and IsVar (but
not IsForbiddenLexical, which indicates parameter names themselves).
When emitting block declaration instantiation, we were not calling
set_local_initialized() after writing block-scoped function
declarations to local variables via Mov. This caused unnecessary
ThrowIfTDZ checks to be emitted when those locals were later read.
Block-scoped function declarations are always initialized at block
entry (via NewFunction + Mov), so TDZ checks for them are redundant.
When hex, octal, or binary integer literals overflow u64, we used to
fall back to UINT64_MAX. This produced incorrect results for any value
larger than 2^64.
Fix this by accumulating the value as a double digit-by-digit when
the u64 parse fails.
When parsing class elements, after consuming the `static` keyword, the
parser would unconditionally consume `async` if it appeared next. This
meant that for `static async()`, where `async` is the method name (not
a modifier), the `async` token was consumed too early and its source
position was lost.
Fix this by applying the same lookahead check used for top-level async
detection: only consume `async` as a modifier if the following token is
not `(`, `;`, `}`, or preceded by a line terminator. This lets `async`
be parsed as a property key with its correct source position.
When an identifier was registered and its group already existed but
had no declaration_kind set, we failed to propagate it. This caused
var declarations to lose their annotation in AST dumps when the
identifier was referenced before its declaration.
Move the duplicated ThrowIfTDZ emission logic from three places in
ASTCodegen.cpp into a single Generator::emit_tdz_check_if_needed()
helper. This handles both argument TDZ (which requires a Mov to
empty first) and lexically-declared variable TDZ uniformly.
This avoids emitting some unnecessary ThrowIfTDZ instructions.
ScopeCollector::add_declaration() was adding var declarations to the
top-level scope's m_var_declarations once per bound identifier and once
more after the for_each_bound_identifier loop - so a `var a, b, c`
would be added 4 times instead of 1.
The Script constructor iterates m_var_declarations and expands each
entry's bound identifiers, resulting in O(N²) work for a single var
statement with N declarators.
Running the Emscripten-compiled version of ScummVM with a var statement
containing 32,174 declarators, this produced over 1 billion entries,
consuming 14+ GB of RAM and blocking the event loop for 35+ seconds.
After this fix, memory use drops to 200 MB and the time to just short
of 200 ms.
The find_source_record lambda was doing a reverse linear scan through
the entire source map for every instruction emitted, resulting in
quadratic behavior. This was catastrophic for large scripts like
Octane/mandreel.js, where compile() dominated the profile at ~30s.
Since both source map entries and instruction iteration are ordered by
offset, replace the per-instruction scan with a forward cursor that
advances in lockstep with instruction emission.
The compile() function was adding source map entries for all
instructions in a block upfront, before processing assembly-time
optimizations (Jump-to-next-block elision, Jump-to-Return/End inlining,
JumpIf-to-JumpTrue/JumpFalse conversion). When a Jump was skipped,
its phantom source map entry remained at the offset where the next
block's first instruction would be placed, causing binary_search to
find the wrong source location for error messages.
Fix by building source map entries inline with instruction emission,
ensuring only actually-emitted instructions get entries. For blocks
with duplicate source map entries at the same offset (from rewind in
fuse_compare_and_jump), the last entry is used.
Add ThisExpression handling to the expression_identifier() helper used
for base_identifier in bytecode instructions. This makes PutById and
GetById emit base_identifier:this when the base is a this expression.