The LZW data for both GIF and TIFF images is sometimes intentionally
missing an end-of-information (EOI) code, which technically is a
decoding error, but in practice is handled gracefully by Firefox, Safari
and Chrome for GIFs and Safari for TIFFs. Let's mirror their behavior.
The included WPT test exposes the fact that trailing garbage bytes can
also result in decoding errors. We handle this in the LZW logic rather
than in the image decoding since our LZW implementation is currently
only used by GIF and TIFF decoding. The error is logged behind the
LZW_DEBUG flag.
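
A minimal sketch of the lenient behavior described above, assuming the
codes have already been unpacked from the bit stream. lzw_decode is a
hypothetical helper written for illustration, not the LibCompress
implementation; it only shows that running out of codes without an EOI
code is treated as a normal end of data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Decodes already-unpacked LZW codes (hypothetical helper, for illustration only).
// `min_code_size` is the number of bits per literal, as in GIF.
inline std::vector<std::uint8_t> lzw_decode(std::vector<std::uint16_t> const& codes, unsigned min_code_size)
{
    assert(min_code_size >= 1 && min_code_size <= 8);
    std::size_t const clear_code = std::size_t { 1 } << min_code_size;
    std::size_t const eoi_code = clear_code + 1;

    std::vector<std::vector<std::uint8_t>> table;
    auto reset_table = [&] {
        table.clear();
        for (std::size_t i = 0; i < clear_code; ++i)
            table.push_back({ static_cast<std::uint8_t>(i) });
        table.push_back({}); // placeholder slot for the clear code
        table.push_back({}); // placeholder slot for the EOI code
    };
    reset_table();

    std::vector<std::uint8_t> output;
    bool has_previous = false;
    std::size_t previous = 0;

    for (std::size_t code : codes) {
        if (code == clear_code) {
            reset_table();
            has_previous = false;
            continue;
        }
        if (code == eoi_code)
            return output; // well-formed end of data

        std::vector<std::uint8_t> entry;
        if (code < table.size()) {
            entry = table[code];
        } else if (code == table.size() && has_previous) {
            // The code refers to the entry that is about to be created.
            entry = table[previous];
            entry.push_back(entry.front());
        } else {
            throw std::runtime_error("invalid LZW code");
        }

        output.insert(output.end(), entry.begin(), entry.end());
        if (has_previous) {
            std::vector<std::uint8_t> new_entry = table[previous];
            new_entry.push_back(entry.front());
            table.push_back(new_entry);
        }
        previous = code;
        has_previous = true;
    }

    // The codes ran out without an EOI code: return what was decoded
    // instead of reporting an error.
    return output;
}
```

With min_code_size 2, the code sequences { 4, 1, 6, 1, 5 } and
{ 4, 1, 6, 1 } decode to the same four bytes in this sketch.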
For zlib this is not necessarily an error state, but the web standards do not
support this feature and the WPT tests explicitly check for this case
to be handled as an error.
If we find ourselves in a situation where zlib can't make any progress,
we don't have any more data to feed in and no output has been produced,
we need to raise an error as the compressed data is incomplete.
This used to lead to an infinite busy loop in which we kept calling
zlib to decompress even though it could not make any progress. As a
result, the promise on the read side of the transformer never fulfilled.
This gives us at least 24 more WPT tests :)
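
The no-progress check can be sketched roughly as follows. StepResult,
Step and decompress_all are hypothetical names invented for this
illustration; Step stands in for a single call into zlib and is not a
real zlib API. The sketch also bails out if a step stalls while input
remains, which is an extra safety net and not something the change
above describes.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical result of one decompression step; not a real zlib type.
struct StepResult {
    std::size_t consumed { 0 }; // input bytes the step used
    std::size_t produced { 0 }; // output bytes the step wrote
    bool finished { false };    // true once the end of the stream was seen
};

// Hypothetical stand-in for a single call into the decompressor.
using Step = std::function<StepResult(std::string const& remaining_input, std::string& output)>;

// Runs `step` until the stream ends. Throws instead of spinning forever
// when a step neither consumes input nor produces output.
inline std::string decompress_all(std::string input, Step const& step)
{
    std::string output;
    for (;;) {
        StepResult result = step(input, output);
        assert(result.consumed <= input.size());
        input.erase(0, result.consumed);
        if (result.finished)
            return output;

        bool made_progress = result.consumed > 0 || result.produced > 0;
        if (!made_progress) {
            if (input.empty())
                throw std::runtime_error("compressed data is incomplete");
            throw std::runtime_error("decompressor stalled with input remaining");
        }
    }
}
```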
Instead of checking the header in ZlibDecompressor::create(), we now
check it in read_some() when it is called for the first time. This
resolves a FIXME in the new DecompressionStream implementation.
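
As a rough illustration of deferring the header check to the first
read, here is a sketch using a hypothetical LazyHeaderReader class. It
is not the ZlibDecompressor code: it only validates the two header
bytes as specified by RFC 1950 and hands back the raw bytes that
follow, without inflating anything.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical reader, for illustration only.
class LazyHeaderReader {
public:
    // Construction never fails, even for data with a bad header.
    explicit LazyHeaderReader(std::vector<std::uint8_t> data)
        : m_data(std::move(data))
    {
    }

    // Returns up to `count` raw bytes following the 2-byte header.
    // The header is validated on the first call only.
    std::vector<std::uint8_t> read_some(std::size_t count)
    {
        if (!m_header_checked) {
            check_header();
            m_header_checked = true;
            m_offset = 2;
        }
        assert(m_offset <= m_data.size());
        std::size_t available = m_data.size() - m_offset;
        std::size_t n = count < available ? count : available;
        std::uint8_t const* begin = m_data.data() + m_offset;
        std::vector<std::uint8_t> result(begin, begin + n);
        m_offset += n;
        return result;
    }

private:
    void check_header() const
    {
        if (m_data.size() < 2)
            throw std::runtime_error("truncated zlib header");
        unsigned cmf = m_data[0];
        unsigned flg = m_data[1];
        // RFC 1950: compression method 8 is deflate, and
        // CMF * 256 + FLG must be a multiple of 31.
        if ((cmf & 0x0f) != 8 || (cmf * 256 + flg) % 31 != 0)
            throw std::runtime_error("invalid zlib header");
    }

    std::vector<std::uint8_t> m_data;
    std::size_t m_offset { 0 };
    bool m_header_checked { false };
};
```

In this sketch a reader over bad data can still be constructed; the
error only surfaces once read_some() is called.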
These compressors will be used by the W3C's CompressionStream, which can
run arbitrary JS, so they may never reach their "finish" steps. Let's not
crash the WebContent process if that happens.
GzipCompressor is currently written assuming that its write_some method
is only called once. When we use this class for LibWeb, we may very well
receive data to compress in small chunks. So this patch makes us write
the gzip header and footer only once, which now resembles the zlib and
deflate compressors.
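
The write-once idea can be sketched like this. FramedWriter is a
hypothetical class made up for this example; the "<header>" and
"<footer:N>" markers merely stand in for the real gzip header and
trailer, and the payload is passed through uncompressed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical framing writer, for illustration only.
class FramedWriter {
public:
    // May be called any number of times with chunks of any size.
    void write_some(std::string const& chunk)
    {
        assert(!m_finished);
        write_header_if_needed();
        m_output += chunk; // a real compressor would compress here
        m_total_input_size += chunk.size();
    }

    // Emits the footer exactly once, after all chunks.
    void finish()
    {
        assert(!m_finished);
        write_header_if_needed(); // an empty stream still gets a header
        m_output += "<footer:" + std::to_string(m_total_input_size) + ">";
        m_finished = true;
    }

    std::string const& output() const { return m_output; }

private:
    void write_header_if_needed()
    {
        if (m_header_written)
            return;
        m_output += "<header>";
        m_header_written = true;
    }

    std::string m_output;
    std::size_t m_total_input_size { 0 };
    bool m_header_written { false };
    bool m_finished { false };
};
```

Writing "ab" and then "cd" before finish() yields a single header and a
single footer around "abcd" in this sketch.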
Compared to version 10 this fixes a bunch of formatting issues, mostly
around structs/classes with attributes like [[gnu::packed]], and
incorrect insertion of spaces in parameter types ("T &"/"T &&").
I also removed a bunch of // clang-format off/on and FIXME comments that
are no longer relevant - on the other hand it tried to destroy a couple of
neatly formatted comments, so I had to add some as well.
OutputMemoryStream was originally a proxy for DuplexMemoryStream that
did not expose any reading API.
Now I need to add another class that is like OutputMemoryStream but only
for static buffers. My first idea was to make OutputMemoryStream do that
too, but I think it's much better to have a distinct class for that.
I originally wanted to call that class FixedOutputMemoryStream but that
name is really cumbersome and it's a bit unintuitive because
InputMemoryStream is already reading from a fixed buffer.
So let's just use DuplexMemoryStream instead of OutputMemoryStream for
any dynamic stuff and create a new OutputMemoryStream for static
buffers.
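
To illustrate what a write-only stream over a static buffer might look
like, here is a minimal sketch. FixedBufferOutputStream and its methods
are hypothetical names chosen for this example and do not reflect the
actual OutputMemoryStream interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical write-only stream over a caller-owned, fixed-size buffer.
class FixedBufferOutputStream {
public:
    FixedBufferOutputStream(std::uint8_t* buffer, std::size_t capacity)
        : m_buffer(buffer)
        , m_capacity(capacity)
    {
        assert(buffer != nullptr || capacity == 0);
    }

    // Copies as many bytes as still fit and returns that count.
    // A short write sets the error flag; the buffer never grows.
    std::size_t write(std::uint8_t const* data, std::size_t size)
    {
        std::size_t n = size < remaining() ? size : remaining();
        if (n > 0)
            std::memcpy(m_buffer + m_offset, data, n);
        m_offset += n;
        if (n < size)
            m_has_error = true;
        return n;
    }

    std::size_t size() const { return m_offset; }
    std::size_t remaining() const { return m_capacity - m_offset; }
    bool has_error() const { return m_has_error; }

private:
    std::uint8_t* m_buffer;
    std::size_t m_capacity;
    std::size_t m_offset { 0 };
    bool m_has_error { false };
};
```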
Consider the following snippet:
    void foo(InputStream& stream) {
        if(!stream.eof()) {
            u8 byte;
            stream >> byte;
        }
    }
There is a very subtle bug in this snippet: for some input streams, eof()
might return false even if no more data can be read. In this case an
error flag would be set on the stream.
Until now I've always ensured that this is not the case, but this made
the implementation of eof() unnecessarily complicated.
InputFileStream::eof had to keep a ByteBuffer around just to make this
possible. That meant a ton of unnecessary copies just to get a reliable
eof().
In most cases it isn't actually necessary to have a reliable eof()
implementation.
In most other cases a reliable eof() is available anyway, because for
some streams, like InputMemoryStream, it is very easy to implement.
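
A small sketch of the situation, using a hypothetical LateEofStream
whose eof() is deliberately unreliable: it only turns true after a read
has already failed. The names are invented for this example and are
not the AK stream API. The robust pattern is to attempt the read and
inspect the outcome rather than trusting eof() beforehand.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stream with an unreliable eof(), for illustration only.
class LateEofStream {
public:
    explicit LateEofStream(std::vector<std::uint8_t> data)
        : m_data(std::move(data))
    {
    }

    // Only becomes true once a read has come up empty.
    bool eof() const { return m_hit_end; }
    bool has_error() const { return m_has_error; }

    // Reads one byte. On failure, `byte` is left untouched and the
    // error flag is set.
    bool read(std::uint8_t& byte)
    {
        assert(m_offset <= m_data.size());
        if (m_offset == m_data.size()) {
            m_hit_end = true;
            m_has_error = true;
            return false;
        }
        byte = m_data[m_offset++];
        return true;
    }

private:
    std::vector<std::uint8_t> m_data;
    std::size_t m_offset { 0 };
    bool m_hit_end { false };
    bool m_has_error { false };
};

// Attempts reads until one fails instead of asking eof() first.
inline std::size_t count_bytes(LateEofStream& stream)
{
    std::size_t count = 0;
    std::uint8_t byte = 0;
    while (stream.read(byte))
        ++count;
    return count;
}
```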
The streaming operator doesn't short-circuit. Consider the following
snippet:
    void foo(InputStream& stream) {
        int a, b;
        stream >> a >> b;
    }
If the first read fails, the second is called regardless. It should be
well defined what happens in this case: nothing.
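
One way to get that behavior is a sticky error flag, sketched below
with a hypothetical StickyErrorStream (not the AK implementation).
Once a read has failed, every later read is a no-op and leaves its
destination untouched, even if enough bytes would still be available.
Multi-byte values are copied in host byte order in this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <utility>
#include <vector>

// Hypothetical stream with a sticky error flag, for illustration only.
class StickyErrorStream {
public:
    explicit StickyErrorStream(std::vector<std::uint8_t> data)
        : m_data(std::move(data))
    {
    }

    bool has_error() const { return m_has_error; }

    StickyErrorStream& operator>>(std::uint8_t& value)
    {
        read_exact(&value, sizeof(value));
        return *this;
    }

    StickyErrorStream& operator>>(std::uint32_t& value)
    {
        read_exact(&value, sizeof(value));
        return *this;
    }

private:
    // Copies exactly `size` bytes or nothing at all.
    void read_exact(void* destination, std::size_t size)
    {
        assert(m_offset <= m_data.size());
        if (m_has_error)
            return; // an earlier read failed: do nothing
        if (m_data.size() - m_offset < size) {
            m_has_error = true;
            return;
        }
        std::memcpy(destination, m_data.data() + m_offset, size);
        m_offset += size;
    }

    std::vector<std::uint8_t> m_data;
    std::size_t m_offset { 0 };
    bool m_has_error { false };
};
```

With only two bytes in the stream, reading a 32-bit value followed by
an 8-bit value leaves both variables unchanged: the first read fails
and the second one is skipped.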
I suspected an error in CircularDuplexStream::read(Bytes, size_t). This
does not appear to be the case, but the test case is useful regardless.
The following script was used to generate the test:
    import gzip
    uncompressed = []
    for _ in range(0x100):
        uncompressed.append(1)
    for _ in range(0x7e00):
        uncompressed.append(0)
    for _ in range(0x100):
        uncompressed.append(1)
    compressed = gzip.compress(bytes(uncompressed))
    compressed = ", ".join(f"0x{byte:02x}" for byte in compressed)
    print(f"""\
    TEST_CASE(gzip_decompress_repeat_around_buffer)
    {{
        const u8 compressed[] = {{
            {compressed}
        }};
        u8 uncompressed[0x8011];
        Bytes{{ uncompressed, sizeof(uncompressed) }}.fill(0);
        uncompressed[0x8000] = 1;
        const auto decompressed = Compress::GzipDecompressor::decompress_all({{ compressed, sizeof(compressed) }});
        EXPECT(compare({{ uncompressed, sizeof(uncompressed) }}, decompressed.bytes()));
    }}
    """, end="")