13 Commits

Author SHA1 Message Date
Ali Mohammad Pur
9964c64446 LibWasm: Implement the i32 const/local fusions for i64 too 2026-02-02 14:11:49 +01:00
Ali Mohammad Pur
ae9ced65b7 LibWasm: Add a bunch more fused ops
- synthetic_argument_set, synthetic_argument_tee
- synthetic_local_get_0..7, synthetic_local_set_0..7
- synthetic_br_nostack, synthetic_br_if_nostack
- synthetic_local_copy for local-to-local copies
- synthetic_i32_{sub,mul,and,or,xor,shl,shru,shrs}2local
2026-02-02 14:11:49 +01:00
Ali Mohammad Pur
921373a045 LibWasm: Implement call argument forwarding using call records 2026-02-02 14:11:49 +01:00
Ali Mohammad Pur
b89ecfc6bc LibWasm: Split parameters from locals 2026-02-02 14:11:49 +01:00
Ali Mohammad Pur
d99f663b1a LibWasm: Implement parsing/validation for proposal exception-handling
Actual execution traps for now.
2025-10-15 01:26:29 +02:00
Ali Mohammad Pur
d6f3f5fd51 LibWasm: Implement proposal 'relaxed-simd' 2025-10-15 01:26:29 +02:00
Ali Mohammad Pur
6a6f747701 LibWasm: Add support for proposal 'tail-call' 2025-10-15 01:26:29 +02:00
Pavel Shliak
cdab6b0a2f LibWasm: Fix pushes for i16x8.replace_lane in Opcode table
The opcode entry declared i16x8_replace_lane with pushes = -1, but
replace_lane pops 2 (vector, lane value) and pushes 1 result vector.
Set pushes to 1 to match the other replace_lane opcodes.
2025-09-06 06:06:44 +02:00
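
The stack effect described in the commit above can be checked with a small sketch. The `StackEffect` struct and the helper functions are hypothetical names for illustration, not LibWasm's actual Opcode table layout; the only fact taken from the commit message is that replace_lane pops 2 values and pushes 1.

```cpp
#include <cassert>

// Hypothetical stack-effect descriptor (an assumption, not LibWasm's real type).
struct StackEffect {
    int pops;
    int pushes;
    // Net change in value-stack depth after the instruction runs.
    constexpr int net() const { return pushes - pops; }
};

// replace_lane pops the vector and the new lane value, then pushes the result vector.
constexpr StackEffect replace_lane_effect { 2, 1 };

// Returns the stack depth after applying the effect, or -1 on underflow.
constexpr int apply(int depth, StackEffect effect)
{
    if (depth < effect.pops)
        return -1;
    return depth - effect.pops + effect.pushes;
}

// Same as apply(), for callers that already know the stack is deep enough.
inline int must_apply(int depth, StackEffect effect)
{
    int result = apply(depth, effect);
    assert(result >= 0);
    return result;
}

// The push count is 1, while the net depth change is -1; the two are distinct.
static_assert(replace_lane_effect.pushes == 1);
static_assert(replace_lane_effect.net() == -1);
```

With two operands on the stack, `apply(2, replace_lane_effect)` leaves a depth of 1; with only one operand it reports underflow.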
Ali Mohammad Pur
22448b0c35 LibWasm: Move the interpreter IP out of the configuration object
This, along with moving the sources and destination out of the config
object, makes it so we don't have to double-deref to get to them on each
instruction, leading to a ~15% perf improvement on dispatch.
2025-08-26 15:20:33 +02:00
Ali Mohammad Pur
6732e1cdc3 LibWasm: Don't clobber registers on (most) calls
This still passes the values on the stack, but registers are now allowed
to cross a call boundary.
This is a very significant (>50%) improvement on the small call
microbenchmarks on my machine.
2025-08-26 15:20:33 +02:00
Ali Mohammad Pur
33cd5ae08c LibWasm: Fuse some very common instruction combos into specialised ops
Largely combinations of i32.const and local.get.
This removes at most a single-digit percentage of the instructions that
go through dispatch, which translates to at most ~10% less dispatch time.

Across most benchmarks, this yields a perf increase of around 5%.
2025-08-08 12:54:06 +02:00
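
The fusion idea in the commit above could look roughly like the following sketch. Everything here is an assumption made for illustration: the `Op` enum, the `Insn` layout, the choice of `local.get` + `i32.const` + `i32.add` as the fused pattern, and the `fuse` and `run` helpers are hypothetical and are not taken from LibWasm. A real pass would also need to make sure no branch targets the middle of a fused sequence, which this sketch ignores.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using i32 = std::int32_t;

// Hypothetical instruction set; the last entry is the fused form.
enum class Op { LocalGet, I32Const, I32Add, LocalGetI32ConstAdd };

// Hypothetical instruction: `a` is a local index or constant, `b` is the fused constant.
struct Insn {
    Op op;
    i32 a;
    i32 b;
};

// Wrapping 32-bit addition, done in unsigned arithmetic to avoid signed overflow.
inline i32 wrap_add(i32 lhs, i32 rhs)
{
    return static_cast<i32>(static_cast<std::uint32_t>(lhs) + static_cast<std::uint32_t>(rhs));
}

// Peephole pass: rewrite local.get; i32.const; i32.add into one fused instruction.
std::vector<Insn> fuse(std::vector<Insn> const& in)
{
    std::vector<Insn> out;
    for (std::size_t i = 0; i < in.size();) {
        if (i + 2 < in.size() && in[i].op == Op::LocalGet
            && in[i + 1].op == Op::I32Const && in[i + 2].op == Op::I32Add) {
            out.push_back({ Op::LocalGetI32ConstAdd, in[i].a, in[i + 1].a });
            i += 3;
        } else {
            out.push_back(in[i]);
            ++i;
        }
    }
    return out;
}

// Tiny stack interpreter used to show that fused and unfused code agree.
i32 run(std::vector<Insn> const& code, std::vector<i32> const& locals)
{
    std::vector<i32> stack;
    for (auto const& insn : code) {
        switch (insn.op) {
        case Op::LocalGet:
            stack.push_back(locals[static_cast<std::size_t>(insn.a)]);
            break;
        case Op::I32Const:
            stack.push_back(insn.a);
            break;
        case Op::I32Add: {
            assert(stack.size() >= 2);
            i32 rhs = stack.back();
            stack.pop_back();
            stack.back() = wrap_add(stack.back(), rhs);
            break;
        }
        case Op::LocalGetI32ConstAdd:
            // One dispatch and one push instead of three dispatches.
            stack.push_back(wrap_add(locals[static_cast<std::size_t>(insn.a)], insn.b));
            break;
        }
    }
    assert(!stack.empty());
    return stack.back();
}
```

In this sketch, a three-instruction sequence collapses to a single instruction, and both forms compute the same value for the same locals.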
Ali Mohammad Pur
0e5ecef848 LibWasm: Try really hard to avoid touching the value stack
This commit adds a register allocator, with 8 available "register"
slots.
In testing with various random blobs, this moves anywhere from 30% to
74% of value accesses into predefined slots, and gives roughly a 20%
perf increase end-to-end.

To actually make this usable, a few structural changes were also made:
- we no longer do one instruction per interpret call
- trapping is an (unlikely) exit condition
- the label and frame stacks are replaced with linked lists with a huge
  node cache size, as we only need to touch the last element and
  push/pop is very frequent.
2025-08-08 12:54:06 +02:00
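
The last structural change in the commit above (a linked list with a node cache standing in for a stack) can be sketched as follows. This is a minimal illustration under stated assumptions: the `CachedListStack` name and its interface are hypothetical, the cache here is unbounded rather than capped at a fixed size, and none of it is taken from the actual LibWasm code.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical stack backed by a singly linked list. Popped nodes are kept on
// a free list (the "node cache") so that later pushes reuse them without allocating.
template<typename T>
class CachedListStack {
public:
    CachedListStack() = default;
    CachedListStack(CachedListStack const&) = delete;
    CachedListStack& operator=(CachedListStack const&) = delete;

    ~CachedListStack()
    {
        free_chain(m_top);
        free_chain(m_cache);
    }

    void push(T value)
    {
        Node* node = m_cache;
        if (node) {
            // Reuse a cached node: no allocation on this path.
            m_cache = node->next;
            node->value = std::move(value);
        } else {
            node = new Node { std::move(value), nullptr };
            ++m_allocations;
        }
        node->next = m_top;
        m_top = node;
    }

    T pop()
    {
        assert(m_top);
        Node* node = m_top;
        m_top = node->next;
        T value = std::move(node->value);
        // Return the node to the cache instead of freeing it.
        node->next = m_cache;
        m_cache = node;
        return value;
    }

    // Only the most recently pushed element is ever accessed.
    T& last()
    {
        assert(m_top);
        return m_top->value;
    }

    bool is_empty() const { return m_top == nullptr; }
    std::size_t allocations() const { return m_allocations; }

private:
    struct Node {
        T value;
        Node* next;
    };

    static void free_chain(Node* node)
    {
        while (node) {
            Node* next = node->next;
            delete node;
            node = next;
        }
    }

    Node* m_top { nullptr };
    Node* m_cache { nullptr };
    std::size_t m_allocations { 0 };
};
```

In this sketch, a push that follows a pop reuses the cached node, so a workload that repeatedly enters and leaves labels or frames stops allocating once the cache has warmed up.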
Timothy Flynn
93712b24bf Everywhere: Hoist the Libraries folder to the top-level 2024-11-10 12:50:45 +01:00