serenity/Userland/Libraries/LibCompress
Nico Weber b61e670122 LibCompress: Speed up CanonicalCode::read_symbol() slow path
Symbols that need <= 8 bits hit a fast path as of #18075, but
the slow path has done a full binary search over all symbols
ever since this code was added in #2963. (#3405 even added a FIXME
for doing this, but #18075 removed it.)

Instead of doing a binary search over all codes for every single
bit read, this implements the Moffat-Turpin approach described at
https://www.hanshq.net/zip.html#huffdec, which requires only one
table lookup per bit.

    hyperfine 'Build/lagom/bin/unzip ~/Downloads/enwik8.zip'
    1.008 s ± 0.016 s  =>  957.7 ms ± 3.9 ms, 5% faster

Due to issue #25005, we can't peek the full 15 bits at once but
have to read them one by one. This makes the code look a bit
different from the code in the linked article.
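
For illustration, here is a minimal sketch of bit-by-bit canonical
Huffman decoding with one table lookup per bit, as described above.
It is not the LibCompress code: CanonicalTables, BitReader,
build_tables() and this read_symbol() are hypothetical names, and the
sketch assumes a valid, non-oversubscribed set of code lengths and
DEFLATE's LSB-first bit packing.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

constexpr int MaxBits = 15; // DEFLATE's maximum code length

// Hypothetical table layout, not the one used by LibCompress.
struct CanonicalTables {
    uint32_t first_code[MaxBits + 1] {};  // smallest code of each length
    uint32_t first_index[MaxBits + 1] {}; // index in `symbols` of that code
    uint32_t count[MaxBits + 1] {};       // number of codes of each length
    std::vector<uint16_t> symbols;        // symbols ordered by (length, value)
};

// lengths[s] is the code length of symbol s; 0 means the symbol is unused.
CanonicalTables build_tables(std::vector<uint8_t> const& lengths)
{
    CanonicalTables t;
    for (auto len : lengths) {
        assert(len <= MaxBits);
        if (len != 0)
            t.count[len]++;
    }

    // Assign the first canonical code and first symbol index per length.
    uint32_t code = 0;
    uint32_t index = 0;
    uint32_t next[MaxBits + 1] {};
    for (int len = 1; len <= MaxBits; ++len) {
        code = (code + t.count[len - 1]) << 1;
        t.first_code[len] = code;
        t.first_index[len] = index;
        next[len] = index;
        index += t.count[len];
    }

    // Place each symbol into its length's bucket, in symbol order.
    t.symbols.resize(index);
    for (size_t s = 0; s < lengths.size(); ++s) {
        if (lengths[s] != 0)
            t.symbols[next[lengths[s]]++] = static_cast<uint16_t>(s);
    }
    return t;
}

// Minimal bit source that hands out one bit at a time, LSB of each byte first.
struct BitReader {
    std::vector<uint8_t> const& bytes;
    size_t position = 0; // in bits

    std::optional<uint32_t> read_bit()
    {
        if (position >= bytes.size() * 8)
            return std::nullopt;
        uint32_t bit = (bytes[position / 8] >> (position % 8)) & 1u;
        ++position;
        return bit;
    }
};

// Reads one bit per iteration and does a constant amount of table work for it.
std::optional<uint16_t> read_symbol(CanonicalTables const& t, BitReader& reader)
{
    uint32_t code = 0;
    for (int len = 1; len <= MaxBits; ++len) {
        auto bit = reader.read_bit();
        if (!bit.has_value())
            return std::nullopt; // ran out of input
        code = (code << 1) | *bit;
        // Codes of this length occupy [first_code, first_code + count).
        // The subtraction is unsigned, so a code below first_code wraps
        // around to a large value and fails the comparison as well.
        uint32_t offset = code - t.first_code[len];
        if (offset < t.count[len])
            return t.symbols[t.first_index[len] + offset];
    }
    return std::nullopt; // no code of any length matched
}
```

With the code lengths 3, 3, 3, 3, 3, 2, 4, 4 (the example alphabet from
RFC 1951), symbol 5 gets the 2-bit code 00 and symbols 6 and 7 get the
4-bit codes 1110 and 1111. Decoding then never needs more loop
iterations than the length of the code being read.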

I also tried not to change CanonicalCode::from_bytes() too much.
It still does 15 passes over all symbols. I think it could do it in
a single pass instead, but that's for a future change.

No behavior change (other than slightly faster perf).
2024-09-14 13:20:48 +02:00