ladybird

mirror of https://github.com/LadybirdBrowser/ladybird synced 2026-04-26 17:55:07 +02:00

Author	SHA1	Message	Date
Andreas Kling	b2b72a1884	LibJS: Defer regex literal compilation to post-parse step Move regex compilation out of the parsing hot path. Both the C++ and Rust parsers now collect raw regex pattern+flags strings during parsing and batch-compile them after parsing completes. This is a prerequisite for moving the Rust parser to a background thread, since LibRegex is thread-unsafe and FFI calls during parsing prevent parallelization. Flag validation remains in the parser since it's trivial string checking with no LibRegex dependency.	2026-03-06 13:06:05 +01:00
Andreas Kling	8bf1d749a1	LibJS: Suppress global identifier optimization for dynamic functions Functions created via new Function() cannot assume that unresolved identifiers refer to global variables, since they may be called in an arbitrary scope. Pass a flag through the scope collector analysis to suppress the global identifier optimization in this case.	2026-02-24 09:39:42 +01:00
Andreas Kling	51639fd555	LibJS: Move labels_in_scope out of ParserState Label scoping is already managed manually at function and arrow function boundaries via move + ScopeGuard, and speculative parse paths never modify labels before their potential failure points. Move the HashMap to a direct Parser member so it is no longer copied on every save_state()/load_state() cycle.	2026-02-10 02:05:20 +01:00
Andreas Kling	fa8f9a6b2c	LibJS: Remove invalid_property_range HashMap from ParserState Replace the HashMap<size_t, Position> used to communicate invalid object literal property ranges from parse_object_expression() to parse_expression() with a field on PrimaryExpressionParseResult. The range now flows through the parser's return values instead of being stashed in persistent ParserState, eliminating a HashMap that was never cleared and got bulk-copied on every save_state() call during speculative parsing.	2026-02-10 02:05:20 +01:00
Andreas Kling	a8a1aba3ba	LibJS: Replace ScopePusher with ScopeCollector Replace the ScopePusher RAII class (which performed scope analysis in its destructor chain during parsing) with a two-phase approach: 1. ScopeCollector builds a tree of ScopeRecord nodes during parsing via RAII ScopeHandle objects. It records declarations, identifier references, and flags, but does not resolve anything. 2. After parsing completes, ScopeCollector::analyze() walks the tree bottom-up and performs all resolution: propagate eval/with poisoning, resolve identifiers to locals/globals/arguments, hoist functions (Annex B.3.3), and build FunctionScopeData. Key design decisions: - ScopeRecord::ast_node is a RefPtr<ScopeNode> to prevent use-after-free when synthesize_binding_pattern re-parses an expression as a binding pattern (the original parse's scope records survive with stale AST node pointers). - Parser::scope_collector() returns the override collector if set (for synthesize_binding_pattern's nested parser), ensuring all scope operations route to the outer parser's scope tree. - FunctionNode::local_variables_names() delegates to its body's ScopeNode rather than copying at parse time, since analysis runs after parsing.	2026-02-10 02:05:20 +01:00
Andreas Kling	fa44fd58d8	LibJS: Remove ParserState::lookahead_lexer The lookahead lexer used by next_token() no longer needs to be kept alive, since tokens created by Parser::next_token() now have any string views guaranteed safe by the fact that they point into the one true SourceCode provided by whoever set up the lexer.	2025-11-09 12:14:03 +01:00
Andreas Kling	841fe0b51c	LibJS: Don't store current token in both Lexer and Parser Just give Parser a way to access the one stored in Lexer.	2025-11-09 12:14:03 +01:00
Timothy Flynn	b955c9b2a9	LibJS: Port the Identifier AST (and related) nodes to UTF-16 This eliminates quite a lot of UTF-8 / UTF-16 churn.	2025-08-13 09:56:13 -04:00
Timothy Flynn	00182a2405	LibJS: Port the JS lexer and parser to UTF-16 This ports the lexer to UTF-16 and deals with the immediate fallout up to the AST. The AST will be dealt with in upcoming commits. The lexer will still accept UTF-8 strings as input, and will transcode them to UTF-16 for lexing. This doesn't actually incur a new allocation, as we were already converting the input StringView to a ByteString for each lexer. One immediate logical benefit here is that we do not need to know offhand how many UTF-8 bytes some special code points occupy. They all happen to be a single UTF-16 code unit. So instead of advancing the lexer by 3 positions in some cases, we can just always advance by 1.	2025-08-13 09:56:13 -04:00
Timothy Flynn	eb74781a2d	LibJS: Keep the lookahead lexer alive after parsing its next token Currently, the lexer holds a ByteString, which is always heap-allocated. When we create a copy of the lexer for the lookahead token, that token will outlive the lexer copy. The token holds a couple of string views into the lexer's source string. This is fine for now, because the source string will be kept alive by the original lexer. But if the lexer were to hold a String or Utf16String, short strings will be stored on the stack due to SSO. Thus the token will hold views into released stack data. We need to keep the lookahead lexer alive to prevent UAF on views into its source string.	2025-08-13 09:56:13 -04:00
ayeteadoe	2e2484257d	LibJS: Enable EXPLICIT_SYMBOL_EXPORT and annotate minimum symbol set	2025-07-22 11:51:29 -04:00
ayeteadoe	539a675802	LibJS: Revert Enable EXPLICIT_SYMBOL_EXPORT This reverts commit `c14173f651`. We should only annotate the minimum number of symbols that external consumers actually use, so I am starting from scratch to do that	2025-07-22 11:51:29 -04:00
Luke Wilde	3d43462ccd	LibJS: Implement the Dynamic Code Brand Checks stage 3 proposal This is an active proposal at stage 3 of the TC39 proposal process. See: https://tc39.es/proposal-dynamic-code-brand-checks/ See: https://github.com/tc39/proposal-dynamic-code-brand-checks This proposal essentially adds support for the TrustedScript type from the Trusted Types specification to eval and Function. This in turn pipes support for the type into the CSP hook to check if the CSP allows dynamic code compilation. However, it currently doesn't support ShadowRealms, so the implementation here is a close approximation, using PerformEval as the basis. See: https://github.com/tc39/proposal-dynamic-code-brand-checks/issues/19 This is required to support the new function signature for the CSP hook, and will allow us to slot in Trusted Types support in the future.	2025-07-09 15:52:54 -06:00
ayeteadoe	c14173f651	LibJS: Enable EXPLICIT_SYMBOL_EXPORT	2025-06-30 10:50:36 -06:00
Timothy Flynn	7280ed6312	Meta: Enforce newlines around namespaces This has come up several times during code review, so let's just enforce it using a new clang-format 20 option.	2025-05-14 02:01:59 -06:00
Aliaksandr Kalenik	7932091e02	LibJS: Allow using local variable for catch parameters Local variables are faster to access and if all catch parameters are locals we can skip lexical environment allocation.	2025-04-22 21:57:25 +02:00
Andreas Kling	ef4e7b7945	LibJS: Make JS parser emit accurate `this` insights for constructors This way we don't have to handle it when instantiating the constructor.	2025-04-08 18:52:35 +02:00
Andreas Kling	6c70dc5f09	LibJS: Create FunctionParameters earlier in the parser This avoids making multiple copies of the Vector<FunctionParameter> in the parser.	2025-03-27 19:50:13 +00:00
Andreas Kling	f1914893e9	LibJS+LibWeb: Remove more uses of DeprecatedFlyString	2025-03-24 22:27:17 +00:00
Andreas Kling	46a5710238	LibJS: Use FlyString in PropertyKey instead of DeprecatedFlyString This required dealing with substantial fallout.	2025-03-24 22:27:17 +00:00
Timothy Flynn	b64a355a30	LibJS: Remove support for the "assert" keyword for import attributes This was removed from the spec some time ago. See: https://github.com/tc39/proposal-import-attributes/commit/14286bb	2025-01-21 14:58:32 +01:00
Shannon Booth	f87041bf3a	LibGC+Everywhere: Factor out a LibGC from LibJS Resulting in a massive rename across almost everywhere! Alongside the namespace change, we now have the following names: * JS::NonnullGCPtr -> GC::Ref * JS::GCPtr -> GC::Ptr * JS::HeapFunction -> GC::Function * JS::CellImpl -> GC::Cell * JS::Handle -> GC::Root	2024-11-15 14:49:20 +01:00
Timothy Flynn	93712b24bf	Everywhere: Hoist the Libraries folder to the top-level	2024-11-10 12:50:45 +01:00
Andreas Kling	13d7c09125	Libraries: Move to Userland/Libraries/	2021-01-12 12:17:46 +01:00
AnotherTest	8ca0e8325a	LibJS: Don't save rule start positions along with the parser state This fixes #4617. Also fixes the small problem where some save states would be leaked.	2020-12-29 17:39:42 +01:00
AnotherTest	d0363bca01	LibJS: `save_state()' before creating a RulePosition Fixes #4617.	2020-12-29 10:51:33 +01:00
AnotherTest	b34b681811	LibJS: Track source positions all the way down to exceptions This makes exceptions have a trace of source positions too, which could probably be helpful in making fancier error tracebacks.	2020-12-29 00:58:43 +01:00
Linus Groh	abd49c174a	LibJS: Include source location hint in Parser::print_errors()	2020-12-06 18:52:52 +01:00
Andreas Kling	d617120499	LibJS: Parse "with" statements :^)	2020-11-28 17:16:48 +01:00
Linus Groh	39a1c9d827	LibJS: Implement 'new.target' This adds a new MetaProperty AST node which will be used for 'new.target' and 'import.meta' meta properties. The parser now distinguishes between "in function context" and "in arrow function context" (which is required for this). When encountering TokenType::New we will attempt to parse it as meta property and resort to regular new expression parsing if that fails, much like the parsing of labelled statements.	2020-11-02 22:40:59 +01:00
Linus Groh	e07a39c816	LibJS: Replace 'size_t line, size_t column' with 'Optional<Position>' This is a bit nicer for two reasons: - The absence of line number/column information isn't based on 'values are zero' anymore but on Optional's value - When reporting syntax errors with position information other than the current token's position we had to store line and column ourselves, like this: auto foo_start_line = m_parser_state.m_current_token.line_number(); auto foo_start_column = m_parser_state.m_current_token.line_column(); ... syntax_error("...", foo_start_line, foo_start_column); Which now becomes: auto foo_start = position(); ... syntax_error("...", foo_start); This makes it easier to report correct positions for syntax errors that only emerge a few tokens later :^)	2020-11-02 22:40:59 +01:00
Linus Groh	9e80c67608	LibJS: Fix "use strict" directive false positives By having the "is this a use strict directive?" logic in parse_string_literal() we would apply it to any string literal, which is incorrect and would lead to false positives - e.g.: "use strict" + 1 `"use strict"` "\123"; ({"use strict": ...}) Relevant part from the spec which is now implemented properly: [...] and where each ExpressionStatement in the sequence consists entirely of a StringLiteral token [...] I also got rid of UseStrictDirectiveState which is not needed anymore. Fixes #3903.	2020-11-02 13:13:54 +01:00
Linus Groh	563d3c8055	LibJS: Require initializer for 'const' variable declaration	2020-10-30 23:43:38 +01:00
Linus Groh	dca9e4ec10	LibJS: Implement rules for duplicate function parameters - A regular function can have duplicate parameters except in strict mode or if its parameter list is not "simple" (has a default or rest parameter) - An arrow function can never have duplicate parameters Compared to other engines I opted for more useful syntax error messages than a generic "duplicate parameter name not allowed in this context": "use strict"; function test(foo, foo) {} ^ Uncaught exception: [SyntaxError]: Duplicate parameter 'foo' not allowed in strict mode (line: 1, column: 34) function test(foo, foo = 1) {} ^ Uncaught exception: [SyntaxError]: Duplicate parameter 'foo' not allowed in function with default parameter (line: 1, column: 20) function test(foo, ...foo) {} ^ Uncaught exception: [SyntaxError]: Duplicate parameter 'foo' not allowed in function with rest parameter (line: 1, column: 23) (foo, foo) => {} ^ Uncaught exception: [SyntaxError]: Duplicate parameter 'foo' not allowed in arrow function (line: 1, column: 7)	2020-10-25 12:56:02 +01:00
Linus Groh	4fb96afafc	LibJS: Support LegacyOctalEscapeSequence in string literals https://tc39.es/ecma262/#sec-additional-syntax-string-literals The syntax and semantics of 11.8.4 is extended as follows except that this extension is not allowed for strict mode code: Syntax EscapeSequence:: CharacterEscapeSequence LegacyOctalEscapeSequence NonOctalDecimalEscapeSequence HexEscapeSequence UnicodeEscapeSequence LegacyOctalEscapeSequence:: OctalDigit [lookahead ∉ OctalDigit] ZeroToThree OctalDigit [lookahead ∉ OctalDigit] FourToSeven OctalDigit ZeroToThree OctalDigit OctalDigit ZeroToThree :: one of 0 1 2 3 FourToSeven :: one of 4 5 6 7 NonOctalDecimalEscapeSequence :: one of 8 9 This definition of EscapeSequence is not used in strict mode or when parsing TemplateCharacter. Note It is possible for string literals to precede a Use Strict Directive that places the enclosing code in strict mode, and implementations must take care to not use this extended definition of EscapeSequence with such literals. For example, attempting to parse the following source text must fail: function invalid() { "\7"; "use strict"; }	2020-10-24 16:34:01 +02:00
Linus Groh	80bb62b9cc	LibJS: Distinguish between statement and declaration This separates matching/parsing of statements and declarations and fixes a few edge cases where the parser would incorrectly accept a declaration where only a statement is allowed - for example: if (foo) const a = 1; for (var bar;;) function b() {} while (baz) class c {}	2020-10-23 19:13:06 +02:00
Linus Groh	15642874f3	LibJS: Support all line terminators (LF, CR, LS, PS) https://tc39.es/ecma262/#sec-line-terminators	2020-10-22 10:06:30 +02:00
Linus Groh	6331d45a6f	LibJS: Move checks for invalid getter/setter params to parse_function_node This allows us to provide better error messages as we can point the syntax error location to the exact first invalid parameter instead of always the end of the function within a object literal or class definition. Before this change: const Foo = { set bar() {} } ^ Uncaught exception: [SyntaxError]: Object setter property must have one argument (line: 1, column: 28) class Foo { set bar() {} } ^ Uncaught exception: [SyntaxError]: Class setter method must have one argument (line: 1, column: 26) After this change: const Foo = { set bar() {} } ^ Uncaught exception: [SyntaxError]: Setter function must have one argument (line: 1, column: 23) class Foo { set bar() {} } ^ Uncaught exception: [SyntaxError]: Setter function must have one argument (line: 1, column: 21) The only possible downside of this change is that class getters/setters and functions in objects are not distinguished in the message anymore - I don't think that's important though, and classes are (mostly) just syntactic sugar anyway.	2020-10-20 20:27:58 +02:00
Linus Groh	db75be1119	LibJS: Refactor parse_function_node() bool parameters into bit flags I'm about to add even more options and a bunch of unnamed true/false arguments is really not helpful. Let's make this a single parse options parameter using bit flags.	2020-10-20 20:27:58 +02:00
Linus Groh	46cc1f718e	LibJS: Unprefixed octal numbers are a syntax error in strict mode	2020-10-19 20:08:22 +02:00
Linus Groh	965d952ff3	LibJS: Share parameter parsing between regular and arrow functions This simplifies try_parse_arrow_function_expression() and fixes a few cases that should not produce an arrow function AST but did: (a,,) => {} (a b) => {} (a ...b) => {} (...b a) => {} The new parsing logic checks whether parens are expected and uses parse_function_parameters() if so, rolling back if a new syntax error occurs during that. Otherwise it's just an identifier in which case we parse the single parameter ourselves.	2020-10-19 11:31:55 +02:00
Matthew Olsson	e8da5f99b1	LibJS: break or continue with nonexistent label is a syntax error	2020-10-08 23:27:16 +02:00
Matthew Olsson	e49ea1b520	LibJS: Disallow 'continue' & 'break' outside of their respective scopes 'continue' is no longer allowed outside of a loop, and an unlabeled 'break' is no longer allowed outside of a loop or switch statement. Labeled 'break' statements are still allowed everywhere, even if the label does not exist.	2020-10-08 10:20:49 +02:00
Matthew Olsson	9a82c22a85	LibJS: Disallow 'return' outside of a function	2020-10-08 10:03:21 +02:00
Linus Groh	283ee678f7	LibJS: Validate all assignment expressions, not just "=" The check for invalid lhs and assignment to eval/arguments in strict mode should happen for all kinds of assignment expressions, not just AssignmentOp::Assignment.	2020-10-05 09:25:04 +02:00
Linus Groh	bc701658f8	LibJS: Use String::formatted() for parser error messages	2020-10-04 19:22:02 +02:00
Matthew Olsson	6eb6752c4c	LibJS: Strict mode is now handled by Functions and Programs, not Blocks Since blocks can't be strict by themselves, it makes no sense for them to store whether or not they are strict. Strict-ness is now stored in the Program and FunctionNode ASTNodes. Fixes issue #3641	2020-10-04 10:46:12 +02:00
Muhammad Zahalqa	5a2ec86048	LibJS: Parser refactored to use constexpr precedence table Replaced implementation dependent on HashMap with a constexpr PrecedenceTable based on array lookup.	2020-08-21 16:14:14 +02:00
Jack Karamanian	7533fd8b02	LibJS: Initial class implementation; allow super expressions in object literal methods; add EnvironmentRecord fields and methods to LexicalEnvironment Adding EnvironmentRecord's fields and methods lets us throw an exception when |this| is not initialized, which occurs when the super constructor in a derived class has not yet been called, or when |this| has already been initialized (the super constructor was already called).	2020-06-29 17:54:54 +02:00
Matthew Olsson	61ac1d3ffa	LibJS: Lex and parse regex literals, add RegExp objects This adds regex parsing/lexing, as well as a relatively empty RegExpObject. The purpose of this patch is to allow the engine to not get hung up on parsing regexes. This will aid in finding new syntax errors (say, from google or twitter) without having to replace all of their regexes first!	2020-06-07 19:06:55 +02:00
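
Commit 5a2ec86048 in the list above describes replacing a HashMap-based operator precedence lookup with a constexpr PrecedenceTable based on array lookup. The following is a minimal sketch of that general technique, not the LibJS implementation: the reduced TokenType enum, the member names, and the precedence values are all hypothetical and chosen only for illustration.

```cpp
#include <cstddef>

// Hypothetical, heavily reduced token set (an assumption for this sketch).
enum class TokenType : std::size_t {
    Plus,
    Minus,
    Asterisk,
    Slash,
    Equals,
    Identifier, // not an operator
    Count,      // number of enumerators above, used to size the table
};

// Table filled once by a constexpr constructor, so no hashing or heap
// allocation is needed at lookup time.
class PrecedenceTable {
public:
    constexpr PrecedenceTable()
    {
        // Illustrative values only; entries left untouched stay 0,
        // which this sketch treats as "not an operator".
        set(TokenType::Asterisk, 3);
        set(TokenType::Slash, 3);
        set(TokenType::Plus, 2);
        set(TokenType::Minus, 2);
        set(TokenType::Equals, 1);
    }

    // A lookup is a single array index.
    constexpr int get(TokenType type) const
    {
        return m_precedence[static_cast<std::size_t>(type)];
    }

private:
    constexpr void set(TokenType type, int precedence)
    {
        m_precedence[static_cast<std::size_t>(type)] = precedence;
    }

    int m_precedence[static_cast<std::size_t>(TokenType::Count)] {};
};

constexpr PrecedenceTable g_operator_precedence;

// Because the table is constexpr, its contents can be checked at compile time.
static_assert(g_operator_precedence.get(TokenType::Asterisk) > g_operator_precedence.get(TokenType::Plus),
    "multiplication should bind tighter than addition");
static_assert(g_operator_precedence.get(TokenType::Identifier) == 0,
    "non-operators should have no precedence");
```

Indexing by the enumerator value keeps each lookup to one bounds-known array access, at the cost of one table slot per token type whether or not it is an operator.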


89 Commits