Commit Graph

47 Commits

Andreas Kling
1f30af9f5a LibGC: Restructure GC report phases for incremental sweep
The phase breakdown was authored when sweep_dead_cells was the
single STW sweep phase, with sweep_callbacks and weak-container
work nested under it. After incremental sweep landed, that nesting
no longer matches reality: sweep_callbacks runs at STW every
collection, the weak-container prune is its own STW step, and
sweep_dead_cells only runs for CollectEverything.

Promote prune_weak_containers and sweep_callbacks to top-level
phases so they show up correctly in the report, and gate the
sweep_dead_cells subsection on a non-zero time so normal
collections no longer print a wall of zero rows.

The prune-weak-containers loop was previously untimed, leaving an
unaccounted gap in the per-GC totals. Wire it through the existing
ScopedPhaseTimer mechanism. The PhaseTimings field for it is
renamed from sweep_weak_containers_us to prune_weak_containers_us
to disambiguate from sweep_weak_blocks_us, which times a different
piece of work.
2026-05-10 10:58:11 +02:00
Andreas Kling
a4945a651f LibGC: Report incremental sweep batches
Record incremental sweep batch timing while LIBGC_LOG_LEVEL enables GC
reporting. Print the batch summary once the incremental sweep fully
finishes, so normal collection reports include the delayed sweep work
instead of leaving the sweep section empty for incremental collections.
2026-05-10 10:58:11 +02:00
Andreas Kling
d4aff987ab LibGC: Use live HeapBlock registry during GC instead of rebuilding it
Now that Heap maintains a persistent set of live heap blocks, use it
in the marking and conservative root scanning phases instead of
rebuilding a local copy on every collection cycle.
2026-05-10 10:58:11 +02:00
Andreas Kling
245b7d74a7 LibGC: Prune weak containers in stop-the-world phase of GC
Move weak container cleanup (remove_dead_cells) out of both
sweep_dead_cells() and start_incremental_sweep() to the place
where it is actually safe to inspect cell state: collect_garbage().

Previously, remove_dead_cells could access cells that had already
been swept and poisoned by ASAN, causing use-after-poison crashes
when a new GC triggered while an incremental sweep was in progress.
2026-05-10 10:58:11 +02:00
Andreas Kling
fb4095ae50 LibGC: Implement incremental sweeping for reduced GC pause times
Instead of sweeping all heap blocks in one go after marking, sweep
incrementally, one block at a time, interleaved with program execution.
This significantly reduces worst-case GC pause times by spreading
sweep work across multiple smaller time slices.

Sweep is driven by two complementary mechanisms:

1. Timer-based sweeping: A 16ms repeating timer drives background
   sweep work, processing blocks for up to 5ms per timer fire.

2. Allocation-directed sweeping: Each allocator sweeps its own
   pending blocks before creating new ones, ensuring forward
   progress even without timer events.

Each allocator maintains its own list of blocks pending sweep,
and allocators with pending work are tracked in a separate list
for efficient timer-driven sweeping.

Key implementation details:

- Newly allocated cells during sweep are marked immediately to
  prevent premature collection.

- Mark bits are cleared incrementally as each block is swept,
  rather than in a separate pass over the entire heap.

- Finalization and weak reference processing remain stop-the-world
  since they must complete atomically before any sweeping occurs.
2026-05-10 10:58:11 +02:00
Andreas Kling
3b9c55bd5c LibGC: Gate per-GC report on LIBGC_LOG_LEVEL environment variable
Reading LIBGC_LOG_LEVEL once at startup picks the verbosity for the
collection reports:

  0/default: silent.
  1:         per-GC report with totals and detailed per-phase timings.
  2+:        everything in level 1, plus the full block allocator dump.

The existing collect_garbage(..., print_report=true) entry point still
works and behaves like a per-call floor of level 1, so the DevTools
"collect-garbage" inspector request keeps emitting a report regardless
of the env var. The allocator dump is now off by default at level 1
since most reports do not need it.

Also right-align the percentage column in the breakdown so the numbers
line up cleanly.
2026-05-10 10:58:11 +02:00
Andreas Kling
895def2bd5 LibGC: Add detailed per-phase timing breakdown to GC report
The per-collection report now includes microsecond timings (with
percentage of total) for each phase and major subphase:

  gather_roots
    must-survive scan / embedder roots / explicit roots
    conservative roots
      register scan / stack scan / conservative-vector / cell lookup
  mark_live_cells
    initial visit / BFS marking / clear uprooted
  finalize_unmarked_cells
  sweep_weak_blocks
  sweep_dead_cells
    block iteration / weak containers / sweep callbacks
    block reclassify / update threshold

Timings are recorded via a small RAII helper into a file-scope struct,
keeping all the plumbing inside Heap.cpp so the public Heap.h surface
stays untouched. Sweep stats now travel back to collect_garbage() the
same way, which lets the report move out of sweep_dead_cells() into a
single print_gc_report() helper run after every phase has completed.
2026-05-10 10:58:11 +02:00
Andreas Kling
93c2175fc7 LibGC: Report GC times in microseconds with human-readable byte counts
Switch the per-GC report to a precise timer and microsecond output, and
format all byte counts via human_readable_size so they read naturally
(e.g. "12.3 MiB" instead of "12923847 bytes").
2026-05-10 10:58:11 +02:00
kunlinglio
8a259c1cde LibGC: Remove size-based allocator support 2026-05-09 15:07:24 +02:00
Andreas Kling
adfc9d263f LibGC: Defer per-block madvise to a global background worker
deallocate_block() used to call MADV_FREE_REUSABLE / MADV_FREE /
MADV_DONTNEED inline on every freed block. With sweep typically
freeing many blocks per GC, the cumulative syscall cost shows up
as real GC pause time.

Move the work onto a single global "decommit worker" thread:

- deallocate_block now just poisons the slot and pushes it onto a
  per-allocator m_freshly_freed queue. No syscalls.
- allocate_block prefers m_freshly_freed over m_blocks, so a slot
  that's recycled before the worker sees it skips the
  REUSABLE/REUSE pair entirely. This is the main payoff.
- Heap::sweep_dead_cells kicks the worker at the end of sweep.
  The worker sleeps 50 ms after each kick to give the JS thread
  breathing room, then drains each registered allocator's
  m_freshly_freed, madvises slots in batches of 64 with
  sched_yield between batches, and splices them onto m_blocks.
- Per-allocator refcount + condvar lets ~BlockAllocator wait
  until the worker has dropped its reference before our storage
  goes away. (Chunks themselves remain leaked: type-isolated VM
  is permanent, so we never tear them down.)
2026-05-07 20:09:05 +02:00
Andreas Kling
8fc7dfad26 LibGC: Include external memory in GC thresholds
Add a Cell hook for externally owned memory, and retally live external
bytes while sweeping after a collection.

Use the combined live cell and external byte count when sizing the next
GC threshold. External allocation notifications also participate in the
allocation-since-GC trigger.
2026-05-07 10:03:09 +02:00
Andreas Kling
7d9074efa8 LibGC: Tune heap allocation threshold and growth factor
Establish post-collection heap thresholds with a 1.75x growth factor
over live byte count, with an 8 MiB minimum.

These constants were chosen based on a benchmark sweep of the
Speedometer browser benchmarks (and a wider set of JS workloads) — see
the PR description for the data behind the choice.

Keep the constants in Heap.cpp instead of Heap.h so future tweaks don't
trigger 1000+ file rebuilds.
2026-05-07 10:03:09 +02:00
Jelle Raaijmakers
4bba63839e LibGC: Do not inadvertently resurrect stale pointers in sanitizer builds
In sanitizer builds, we need to convert the fake ASan stack pointers to
the real ones in order to perform a conservative scan. We were blindly
scanning these stack frames regardless of whether they belong to the
_active_ stack range, i.e. the current function's frame and everything
above it. It's very likely that stale pointers exist below the stack
pointer, and we now take care to exclude that range.

Fixes a flake in the LibJS builtins/WeakRef/WeakRef.prototype.deref.js
test.
2026-04-30 14:20:39 +02:00
Andreas Kling
fe48e27a05 LibJS: Replace GC::Weak with GC::RawPtr in inline cache entries
Property lookup cache entries previously used GC::Weak<T> for shape,
prototype, and prototype_chain_validity pointers. Each GC::Weak
requires a ref-counted WeakImpl allocation and an extra indirection
on every access.

Replace these with GC::RawPtr<T> and make Executable a WeakContainer
so the GC can clear stale pointers during sweep via remove_dead_cells.

For static PropertyLookupCache instances (used throughout the runtime
for well-known property lookups), introduce StaticPropertyLookupCache
which registers itself in a global list that also gets swept.

Now that inline cache entries use GC::RawPtr instead of GC::Weak,
we can compare shape/prototype pointers directly without going
through the WeakImpl indirection. This removes one dependent load
from each IC check in GetById, PutById, GetLength, GetGlobal, and
SetGlobal handlers.
2026-03-08 10:27:13 +01:00
Zaggy1024
d0e7ffdf37 LibGC: Add missing root type switch cases in graph dump
Also, remove the default case so that we don't end up with missing
cases again.
2026-03-03 11:26:42 -06:00
Zaggy1024
7339c26059 LibGC: Use AK::unwind_stack_from_frame_pointer for stack pointer traces
This should allow it to function on RISC-V.
2026-03-03 11:26:42 -06:00
Zaggy1024
a3922aa570 LibGC+Meta: Include inline frames in the GC explorer's stack trace 2026-03-01 21:50:51 +01:00
Zaggy1024
03284abee9 LibGC+Meta: Display a frame size in the GC graph explorer's stack trace
This can help us track down overly-large frames that might be
contributing stale roots.
2026-03-01 21:50:51 +01:00
Zaggy1024
1a3e8fdf60 LibGC+Meta: Look up and display stack frames for StackPointer roots
This adds a stack trace to the JSON output from GC graph dumps which
is shown in a default-collapsed tray on the right side of the graph
explorer. When a stack pointer root is selected, the stack frame it
originated from is highlighted in the tray.
2026-03-01 21:50:51 +01:00
Ben Wiederhake
af489080c4 LibGC: Remove unused header in Heap 2026-02-23 12:15:23 +01:00
Shannon Booth
95e13f71a9 LibGC: Prefer Optional<StringView> for CellAllocator class name
In an effort towards removing the use of the null state of StringView.
2026-02-21 12:37:44 +01:00
Andreas Kling
4a3ee27702 LibGC: Use Vector::grow_capacity() in MarkingVisitor
Using ensure_capacity() was a mistake, as that API is for specifying an
exact needed capacity, while grow_capacity() is for growing at a
reasonable rate.

Amusingly, we ended up with very different behavior on macOS and Linux
here, since ensure_capacity() calls kmalloc_good_size() which quantizes
to malloc bucket sizes on macOS, but is effectively a no-op on Linux.

Extreme slowdown on Linux caught by GarBench/marking-stress.js
2026-01-08 21:42:01 +01:00
Andreas Kling
8b19992f8c LibGC: Make MarkingVisitor better at bulk-visiting Vector<JS::Value>
When passing a Vector<JS::Value> to the MarkingVisitor, we were
iterating over the vector and visiting one value at a time. This led
to a very inefficient way of building up the GC's work queue.

By adding a new visit_impl() virtual to Cell::Visitor, we can now
grow the work queue capacity once, and then add without incrementally
growing the storage.
2026-01-07 20:51:17 +01:00
Andreas Kling
2ac363dcba LibGC: Only call finalize() on types that override finalize()
This dramatically cuts down on time spent in the GC's finalizer pass,
since most types don't override finalize().
2026-01-07 20:51:17 +01:00
Andreas Kling
75ad452099 LibGC: Remove one redundant HeapBlock enumeration pass in GC
We were enumerating all HeapBlocks twice to build a HashTable of all
live blocks. With this commit, we only do it once.
2026-01-07 20:51:17 +01:00
Andreas Kling
280049e52d LibGC+LibWeb: Only ask relevant cell types if they must survive GC
Instead of checking if every single cell overrides the "must survive GC"
virtual, we can make this a HeapBlock level thing.

This avoids almost an entire GC heap traversal during the mark phase.
2026-01-07 20:51:17 +01:00
Aliaksandr Kalenik
763d638353 LibGC: Fix incorrect &cell key in GraphConstructorVisitor 2025-12-27 19:06:56 +01:00
Aliaksandr Kalenik
c26c9a9e45 LibGC: Skip non-live cells in GraphConstructorVisitor
Makes `GraphConstructorVisitor` consistent with `MarkingVisitor`.
2025-12-27 19:06:56 +01:00
Aliaksandr Kalenik
ed58f85b75 LibGC: Introduce separate GC root type for "must survive GC" cells
This way `GraphConstructorVisitor` is aware of `MustSurviveGC` roots and
can include them in a dump.
2025-12-27 19:06:56 +01:00
Aliaksandr Kalenik
bfd9658181 LibGC: Add handling for ConservativeVector in dump() 2025-12-26 19:48:46 +01:00
Andreas Kling
2d29ca7e59 LibGC: Dump GC block allocator stats before running post-GC tasks
Post-GC tasks may trigger another GC, and things got very confusing
when that happened. Just dump all stats before running tasks.

Also add a separate Heap function to run these tasks. This makes
backtraces much easier to understand.
2025-12-25 20:21:37 +01:00
Andreas Kling
2a4a8a15f5 LibGC: Make must_survive_garbage_collection() actually work
This had two fatal bugs:

1. We didn't actually mark the cell that must survive GC, we only
   visited its edges.

2. Worse, we didn't actually mark anything at all! We just added
   cells to MarkingVisitor's work queue, but this happened after
   the work queue had already been processed.

This commit fixes these issues by moving the "must survive" pass
earlier in the mark phase.
2025-12-25 20:21:37 +01:00
Andreas Kling
710ea3e20a LibGC: Correct "reserved" field calculation in GC block allocator dumps 2025-12-21 12:08:41 -06:00
Andreas Kling
e9b0ef0afa LibGC: Add allocator statistics to post-GC report 2025-12-19 20:21:07 -06:00
Andreas Kling
716e5f72f2 LibGC: Always use 16 KiB as HeapBlock size
Before this change, we'd use the system page size as the HeapBlock
size. This caused it to vary on different platforms, going as low
as 4 KiB on most Linux systems.

To make this work, we now use posix_memalign() to ensure we get
size-aligned allocations on every platform.

Also nice: HeapBlock::BLOCK_SIZE is now a constant.
2025-12-19 20:21:07 -06:00
Andreas Kling
8289b24a7e LibJS: Introduce VM::the() and use it instead of caching VM pointer
In our process architecture, there's only ever one JS::VM per process.
This allows us to have a VM::the() singleton getter that optimizes
down to a single global access everywhere.

Seeing 1-2% speed-up on all JS benchmarks from this.
2025-12-09 11:58:39 -06:00
Andreas Kling
d234e9ee71 LibGC: Add GC::Heap::the()
There's only ever one GC::Heap per process, so let's have a way to find
it even when you have no context.
2025-11-01 08:40:32 +01:00
Andreas Kling
25a5ed94d6 LibGC: Add GC::Weak<T> as an alternative to AK::WeakPtr<T>
This is a weak pointer that integrates with the garbage collector.
It has a number of differences compared to AK::WeakPtr, including:

- The "control block" is allocated from a well-packed WeakBlock owned by
  the GC heap, not just a generic malloc allocation.

- Pointers to dead cells are nulled out by the garbage collector
  immediately before running destructors.

- It works on any GC::Cell derived type, meaning you don't have to
  inherit from AK::Weakable for the ability to be weakly referenced.

- The Weak always points to a control block, even when "null" (it then
  points to a null WeakImpl), which means one less null check when
  chasing pointers.
2025-10-17 17:22:16 +02:00
Andreas Kling
11ece7de10 LibGC: Add GC::RootHashMap<...> template container
This is a GC-aware wrapper around AK::HashMap. Entry values are treated
as GC roots, much like the GC::RootVector we already had.

We also provide GC::OrderedRootHashMap as a convenience.
2025-05-03 17:33:54 +02:00
Luke Wilde
5146bbe296 LibGC: Visit the edges of the cells that must survive garbage collection
Previously, we would only keep the cell that must survive alive, but
none of its edges.

This cropped up with a GC UAF in must_survive_garbage_collection of
WebSocket in .NET's SignalR frontend implementation, where an
out-of-scope WebSocket had its underlying EventTarget properties
garbage collected, and must_survive_garbage_collection read from the
destroyed EventTarget properties.

See: https://github.com/dotnet/aspnetcore/blob/main/src/SignalR/clients/ts/signalr/src/WebSocketTransport.ts#L81
Found on https://www.formula1.com/ during a live session.

Co-Authored-By: Tim Flynn <trflynn89@pm.me>
2025-02-27 14:35:28 -05:00
Timothy Flynn
bc54c0cdfb AK+Everywhere: Store JSON strings as String 2025-02-20 19:27:51 -05:00
Timothy Flynn
70eb0ba1cd AK+Everywhere: Remove the char const* JSON value constructor 2025-02-20 19:27:51 -05:00
Andreas Kling
51a91771b8 LibJS+LibGC: Run FinalizationRegistry cleanup host hook *after* GC
Before this change, it was possible for a second GC to get triggered
in the middle of a first GC, due to allocations happening in the
FinalizationRegistry cleanup host hook. To avoid this causing problems,
we add a "post-GC task" mechanism and use that to invoke the host hook
once all other GC activity is finished, and we've unset the "collecting
garbage" flag.

Note that the test included here only fails reliably when running with
the -g flag (collect garbage after each allocation).

Fixes #3051
2025-01-23 12:10:21 +01:00
InvalidUsernameException
01f8ab35f1 LibGC: Rename remaining occurrence of marked vector
In 3bfb0534be `MarkedVector` was renamed to `RootVector`, but some
related symbols were missed. This commit corrects this.
2025-01-02 16:22:29 -07:00
Andreas Kling
3bfb0534be LibGC: Rename MarkedVector => RootVector
Let's try to make it a bit more clear that this is a Vector of GC roots.
2024-12-26 19:10:44 +01:00
Pavel Shliak
03ac6e6e87 LibGC: Preallocate space before dumping GC graph
Speeds up the append_gc_graph function by preallocating space.
This change reduces the time taken to dump the GC graph by 4%
on about:blank.
2024-12-14 09:06:58 +01:00
Shannon Booth
f87041bf3a LibGC+Everywhere: Factor out a LibGC from LibJS
Resulting in a massive rename across almost everywhere! Alongside the
namespace change, we now have the following names:

 * JS::NonnullGCPtr -> GC::Ref
 * JS::GCPtr -> GC::Ptr
 * JS::HeapFunction -> GC::Function
 * JS::CellImpl -> GC::Cell
 * JS::Handle -> GC::Root
2024-11-15 14:49:20 +01:00