Brings wow.apng from 1.2M to 606K, while reducing encoding time from
233 ms to 167 ms.
(For comparison, writing wow.webp currently takes 88 ms and produces
a 255K file. The input wow.gif is 184K.)
...for images that don't use a color indexing transform.
For now, write the predictor transform unconditionally, and predict
every pixel as its left neighbor (that is, use predictor 1 everywhere).
We can grow a better heuristic over time, and eventually we might want
to use more than one of the 13 different predictor modes. But on average,
applying this transform unconditionally is better than not applying it
at all.
(Also, it's possible to disable this transform using `image`'s
`--webp-allowed-transforms` flag.)
Using the same benchmark as in #24819, this reduces the total size
of the test images further, from 88M to 69M (21.6%). It does increase
runtime for compressing all these images from about 4.4s to 4.7s,
a 6.8% slowdown.
For the usual test image (no effect on the two animations, which use
the color indexing transform):
sunset_retro.png (876K):
1.2M -> 730K, 31.4 ms ± 0.8 ms -> 31.4 ms ± 0.7 ms
From 37% larger than the input to 16% _smaller_ than the input :^)
(A 39% size reduction for this image.)
Compressing sunset_retro.png with zopflipng produces a 820K image,
so we're smaller even than what the best PNG encoder can do with
this image.
It's not a win for all images: For example, the size of
qoi_benchmark_suite/screenshot_web/sublime.png goes from 1.0M to 1.5M,
a fairly dramatic size increase. Hopefully we can get that back in
the future with better heuristics. (The input sublime.png is 1.3M,
and sublime.png encoded using our QOI encoder creates a 1.6M file,
so the 1.5M isn't atrocious, even though it's much bigger than the
size without the predictor transform.)
Our lossless WebP writer now has a feature set similar to our QOI
writer's: RLE, color cache, and left prediction; subtract green
is somewhat similar to QOI's luma chunk. The WebP
writer also writes huffman trees, which QOI doesn't do. For most
images, the WebP writer creates smaller files than the QOI writer,
while being about 50% slower. The WebP writer also writes smaller
files than SerenityOS's PNG writer, while being ~40x as fast
(for sunset_retro, ~30ms instead of ~1.3s; 730K output instead of
999K).
Only doing RLE and using a single predictor for the entire image is
similar to what fpng is doing (...but fpng uses the T predictor always).
We still don't write a meta prefix image to keep the huffman trees
flatter, we still don't do full LZ77 backward matches, and we still
don't write color transforms. But the writer now has enough features
to be in usable shape. It's now SerenityOS's best-compressing lossless
image writer.
The test_webp_color_indexing_transform_single_channel test uses a
linear horizontal gradient, which the predictor transform compresses
so well that encoded_data_without_color_indexing ends up being
smaller than the color indexed file. Just disable the predictor
transform in that test for now.
We used to write a color indexing transform for grayscale images.
This stores a palette and then indexes into that palette.
For grayscale images with constant alpha, we don't need to store
the palette image: Using a subtract green transform has the same effect.
(For animations, most frames don't have constant alpha because
AnimationWriter replaces identical pixels with transparent black,
making sure that these frames have a mix of opaque and transparent
pixels. But the first frame of a grayscale animation will use this.)
Only saves a couple of bytes for storing the palette image,
but it's also free in terms of performance, and it's conceptually
pleasing.
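A sketch of why subtract green makes the palette unnecessary for
grayscale (illustrative names, not the actual encoder code): green is
subtracted mod 256 from red and blue, so a grayscale pixel with
r == g == b ends up with zeroed red and blue, leaving a single varying
channel just like a palette index would.

```cpp
#include <cstdint>

// Illustrative pixel type.
struct Pixel {
    uint8_t r, g, b, a;
};

// Subtract-green transform: green is subtracted (mod 256) from red and
// blue. For grayscale input this zeroes red and blue.
inline Pixel subtract_green(Pixel p)
{
    p.r = static_cast<uint8_t>(p.r - p.g);
    p.b = static_cast<uint8_t>(p.b - p.g);
    return p;
}
```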
Reduces the size of 7z7c.webp by 30 bytes, from 8818 to 8788 bytes.
Lossless WebP allows having a 1-bit to 11-bit addressed
"color cache", where pixels are inserted into a content-addressed
cache of size `1 << color_cache_bits`. Pixels in the color
cache can be addressed using their index. This can be used
to refer to literal pixels using a single color_cache_bits
large symbol, instead of up to 4 symbols for GBRA.
We default to always using a color cache with 6 bits, unless
the input image already uses only a single channel
(either as-is, or because we write a color indexing transform).
Due to this change, the size of the first prefix group
changes from being known at compile time (256 + 24)
to being known at runtime (256 + 24 + color_cache_size).
Change a few Array<>s to Vector<>s to make this work.
sunset_retro.png (876K):
1.6M -> 1.4M, 29.1 ms ± 0.9 ms -> 31.7 ms ± 0.9 ms
From 83% larger than the input file to 60% larger (12.5% smaller),
for a 9% slowdown.
The two gifs I usually test with don't change: Files using the
color _index_ transform (i.e. that have < 256 colors) don't
use the color _cache_ in our encoder.
The color indexing transform shouldn't make single-channel images
larger (by needlessly writing a palette). If there are <= 16 colors
in the single channel, it should make the image smaller.
...and use a different color name until a (relatively harmless) bug
writing fully-opaque frames to an animation that also has transparent
frames is fixed. (I've had a local fix for that for a while, but
I'm waiting for #24397 to land.)
To determine the color palette, we use the median cut algorithm.
The implementation is correct, but there's obvious room for improvement
both in the median cut algorithm itself and on the encoding side.
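For reference, a minimal median-cut sketch (illustrative, not the
WebPWriter code; this variant splits the largest bucket, while fancier
variants split the bucket with the widest range): repeatedly split a
bucket along its widest channel at the median until we have the desired
number of buckets, then average each bucket into a palette entry.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Color {
    uint8_t r, g, b;
};

// Index (0=r, 1=g, 2=b) of the channel with the largest value range.
static int widest_channel(std::vector<Color> const& bucket)
{
    uint8_t lo[3] = { 255, 255, 255 }, hi[3] = { 0, 0, 0 };
    for (auto const& c : bucket) {
        uint8_t v[3] = { c.r, c.g, c.b };
        for (int i = 0; i < 3; ++i) {
            lo[i] = std::min(lo[i], v[i]);
            hi[i] = std::max(hi[i], v[i]);
        }
    }
    int best = 0;
    for (int i = 1; i < 3; ++i)
        if (hi[i] - lo[i] > hi[best] - lo[best])
            best = i;
    return best;
}

std::vector<Color> median_cut(std::vector<Color> colors, size_t palette_size)
{
    std::vector<std::vector<Color>> buckets { std::move(colors) };
    while (buckets.size() < palette_size) {
        // Split the largest bucket along its widest channel at the median.
        auto it = std::max_element(buckets.begin(), buckets.end(),
            [](auto const& a, auto const& b) { return a.size() < b.size(); });
        if (it->size() < 2)
            break;
        int channel = widest_channel(*it);
        std::sort(it->begin(), it->end(), [channel](Color a, Color b) {
            uint8_t av[3] = { a.r, a.g, a.b }, bv[3] = { b.r, b.g, b.b };
            return av[channel] < bv[channel];
        });
        std::vector<Color> upper(it->begin() + it->size() / 2, it->end());
        it->resize(it->size() / 2);
        buckets.push_back(std::move(upper));
    }
    // Each bucket's average color becomes one palette entry.
    std::vector<Color> palette;
    for (auto const& bucket : buckets) {
        unsigned sum[3] = { 0, 0, 0 };
        for (auto const& c : bucket) {
            sum[0] += c.r;
            sum[1] += c.g;
            sum[2] += c.b;
        }
        palette.push_back({ static_cast<uint8_t>(sum[0] / bucket.size()),
            static_cast<uint8_t>(sum[1] / bucket.size()),
            static_cast<uint8_t>(sum[2] / bucket.size()) });
    }
    return palette;
}
```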
For example, for 7z7c.gif, we now store one 500x500 frame and then
a 94x78 frame at (196, 208) and a 91x78 frame at (198, 208).
This reduces how much data we have to store.
We currently store all pixels in the rect with changed pixels.
We could in the future store pixels that are equal in that rect
as transparent pixels. When inputs are gif files, this would
guarantee that new frames have at most 256 distinct colors
(since GIFs require that), which would help a future color indexing
transform. For now, we don't do that though.
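The core of this is finding the bounding rectangle of changed pixels
between two frames; a sketch with illustrative names (not the
AnimationWriter API):

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

struct Rect {
    int x { 0 }, y { 0 }, width { 0 }, height { 0 };
};

// Bounding rect of all pixels that differ between the two frames;
// only this rect has to be stored for the next frame.
inline Rect changed_rect(std::vector<uint32_t> const& last_frame,
    std::vector<uint32_t> const& current_frame, int width, int height)
{
    int min_x = width, min_y = height, max_x = -1, max_y = -1;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            if (last_frame[y * width + x] != current_frame[y * width + x]) {
                min_x = std::min(min_x, x);
                min_y = std::min(min_y, y);
                max_x = std::max(max_x, x);
                max_y = std::max(max_y, y);
            }
        }
    }
    if (max_x < 0)
        return {}; // Frames are identical.
    return { min_x, min_y, max_x - min_x + 1, max_y - min_y + 1 };
}
```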
The API I'm adding here is a bit ugly:
* WebPs can only store x/y offsets that are a multiple of 2. This
currently leaks into the AnimationWriter base class.
(Since we potentially have to make a webp frame 1 pixel wider
and higher due to this, it's possible to have a frame that has
<= 256 colors in a gif input but > 256 colors in the webp,
if we do the technique above.)
* Every client writing animations has to have logic to track
previous frames, decide which of the two functions to call, etc.
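The even-offset constraint from the first bullet amounts to rounding an
odd offset down and growing the frame by one pixel so it still covers
the original changed rect. A sketch (illustrative names):

```cpp
struct Rect {
    int x, y, width, height;
};

// WebP stores frame x/y divided by 2, so odd offsets get rounded down
// and the frame grows by a pixel to still cover the same area.
inline Rect align_for_webp(Rect r)
{
    int aligned_x = r.x & ~1;
    int aligned_y = r.y & ~1;
    return { aligned_x, aligned_y,
        r.width + (r.x - aligned_x), r.height + (r.y - aligned_y) };
}
```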
This also adds an opt-out flag to `animation`, because:
1. Some clients apparently assume the size of the last VP8L
chunk is the size of the image
(see https://github.com/discord/lilliput/issues/159).
2. Having incremental frames is good for filesize and for
playing the animation start-to-end, but it makes it hard
to extract arbitrary frames (have to extract all frames
from start to target frame) -- but this is meant to be a
delivery codec, not an editing codec. It's also more vulnerable to
corrupted bytes in the middle of the file -- but transport
protocols are good these days.
(It'd also be an idea to write a full frame every N frames.)
For https://giphy.com/gifs/XT9HMdwmpHqqOu1f1a (a 184K gif),
output webp size goes from 21M to 11M.
For 7z7c.gif (an 11K gif), output webp size goes from 2.1M to 775K.
(The webp image data still isn't compressed at all.)
Two bugs:
1. Correctly set bits in VP8X header.
Turns out these were set in the wrong order.
2. Correctly set the `has_alpha` flag.
Also add a test for writing webp files with icc data. With the
additional checks in other commits in this PR, this test catches
the bug in WebPWriter.
Rearrange some existing functions to make it easier to write this test:
* Extract encode_bitmap() from get_roundtrip_bitmap().
encode_bitmap() allows passing extra_args that the test uses to pass
in ICC data.
* Extract expect_bitmaps_equal() from test_roundtrip()
Explicit template arguments must be wrapped in parens,
else they confuse the preprocessor.
Add the parens instead of avoiding the use of explicit template
arguments.
No behavior change.
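A minimal illustration of the preprocessor issue (EXPECT_EQ here is a
stand-in for the test macro): the preprocessor splits macro arguments at
top-level commas, so the comma in an explicit template argument list
makes the macro see too many arguments unless the whole expression is
parenthesized.

```cpp
#include <cassert>

// Stand-in for the real test macro.
#define EXPECT_EQ(a, b) assert((a) == (b))

template<typename A, typename B>
int sum(A a, B b)
{
    return int(a) + int(b);
}

// EXPECT_EQ(sum<int, char>(1, 'a'), 98);   // error: macro sees 3 arguments
// EXPECT_EQ((sum<int, char>(1, 'a')), 98); // parens keep it one argument
```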