Avoid building a temporary Rust token vector before calling back into
C++. The tokenizer now invokes the callback as each token is produced,
while borrowing the already-filtered input for source slices.
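A minimal sketch of that streaming shape; the names and the toy token rule here are illustrative, not the real tokenizer's API:

```rust
// Illustrative only: a per-token callback replaces the temporary Vec<Token>.
#[derive(Debug)]
struct Token<'a> {
    source: &'a str, // borrows a slice of the already-filtered input
}

fn tokenize_with<'a>(filtered_input: &'a str, mut emit: impl FnMut(Token<'a>)) {
    let mut rest = filtered_input;
    while !rest.is_empty() {
        // Toy rule standing in for the real CSS token rules: a token is a
        // maximal run of whitespace or of non-whitespace.
        let first_is_ws = rest.chars().next().unwrap().is_whitespace();
        let len = rest
            .find(|c: char| c.is_whitespace() != first_is_ws)
            .unwrap_or(rest.len());
        let (source, tail) = rest.split_at(len);
        emit(Token { source }); // the callback fires as each token is produced
        rest = tail;
    }
}

fn main() {
    tokenize_with("div { color: red; }", |token| println!("{token:?}"));
}
```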
Also reserve an initial capacity for the C++ token vector based on the
input size, so the common path avoids repeated growth while appending
the converted tokens.
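In Rust terms, the same idea looks like this; the bytes-per-token divisor is an assumption chosen for illustration, not a measured constant:

```rust
// Sketch of the reservation idea: size the vector once, up front, from the
// input length, so appends on the hot path do not trigger reallocation.
fn collect_tokens(filtered_input: &str) -> Vec<String> {
    // Assumed average of ~4 bytes per token; over-estimating slightly just
    // trades a little memory for zero growth in the common case.
    let estimated_tokens = filtered_input.len() / 4 + 1;
    let mut tokens = Vec::with_capacity(estimated_tokens);
    for run in filtered_input.split_whitespace() {
        tokens.push(run.to_string()); // never reallocates if the estimate holds
    }
    tokens
}
```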
With this change, the Rust CSS tokenizer is now ~1.3x faster than the
C++ CSS tokenizer when churning through all of the CSS on
https://vercel.com/.
test-css-tokenizer is updated to run both the C++ and Rust tokenizers
and compare their outputs, ensuring they behave identically. The Parser
still uses the C++ Tokenizer.
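The harness itself is C++, but the comparison it performs boils down to something like the following; both tokenizer entry points are hypothetical stand-ins for the real implementations:

```rust
// Differential check, sketched: run both tokenizers over the same input and
// require identical token streams.
#[derive(Debug, PartialEq)]
struct TokenDump {
    kind: String,   // e.g. "ident", "delim"
    source: String, // the token's source text
}

fn assert_tokenizers_agree(
    css: &str,
    cpp_tokenize: impl Fn(&str) -> Vec<TokenDump>,
    rust_tokenize: impl Fn(&str) -> Vec<TokenDump>,
) {
    let cpp = cpp_tokenize(css);
    let rust = rust_tokenize(css);
    assert_eq!(cpp.len(), rust.len(), "token count mismatch for {css:?}");
    for (index, (a, b)) in cpp.iter().zip(rust.iter()).enumerate() {
        assert_eq!(a, b, "token {index} differs for {css:?}");
    }
}
```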
The LibWeb crate, FFI layer, etc. are all modeled on the existing ones
for other libraries.
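As a rough illustration of what such an FFI surface can look like; the symbol names, callback shape, and safety contract here are assumptions, not Ladybird's actual conventions:

```rust
use std::os::raw::c_void;

/// Hypothetical per-token callback supplied by C++: a pointer/length pair
/// into the filtered input, valid only for the duration of the call.
type TokenCallback =
    extern "C" fn(user_data: *mut c_void, source_ptr: *const u8, source_len: usize);

/// Hypothetical entry point the C++ side would call.
#[no_mangle]
pub extern "C" fn rust_css_tokenize(
    input_ptr: *const u8,
    input_len: usize,
    callback: TokenCallback,
    user_data: *mut c_void,
) {
    // SAFETY: the C++ caller guarantees a valid UTF-8 buffer that outlives
    // this call.
    let bytes = unsafe { std::slice::from_raw_parts(input_ptr, input_len) };
    let input = std::str::from_utf8(bytes).expect("caller passes filtered UTF-8");
    for token in input.split_whitespace() {
        // Toy tokenization again; each token is reported immediately, with
        // its source slice borrowed from the caller's buffer.
        callback(user_data, token.as_ptr(), token.len());
    }
}
```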
This is a direct AI translation to get us started, and not idiomatic
Rust. Future work can be done to make it more sensible.