96 Commits

Author SHA1 Message Date
Zaggy1024
acd4ea9129 LibMedia: Ensure that Matroska seeks after a get_frames() call fails
This fixes a rare flake in HTMLMediaElement-load-after-decode-error,
where if the demuxer managed to get a block successfully but have its
reads in get_frames() aborted by the initial seek, it would skip the
single bad frame and never fire the error event.
2026-04-10 15:21:07 -05:00
Zaggy1024
9d26a274e4 LibMedia: Allow demuxers to report buffered times to PlaybackManager
PlaybackManager then intersects all enabled tracks' buffered time
ranges. This will be used by the media element for the buffered
attribute and to update the ready state.
2026-04-01 02:54:22 -05:00
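
The intersection step described in 9d26a274e4 above can be sketched roughly as follows. This is only an illustration, not the code from the commit: `TimeRange`, `intersect_ranges`, and `intersect_all_tracks` are hypothetical names, times are assumed to be milliseconds, and each track's ranges are assumed to be sorted and non-overlapping.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a buffered time range: [start, end) in milliseconds.
struct TimeRange {
    int64_t start;
    int64_t end;
};

// Intersects two range lists. Both inputs are assumed sorted and non-overlapping.
std::vector<TimeRange> intersect_ranges(std::vector<TimeRange> const& a, std::vector<TimeRange> const& b)
{
    std::vector<TimeRange> result;
    size_t i = 0;
    size_t j = 0;
    while (i < a.size() && j < b.size()) {
        int64_t start = std::max(a[i].start, b[j].start);
        int64_t end = std::min(a[i].end, b[j].end);
        if (start < end)
            result.push_back({ start, end });
        // Advance whichever range ends first; the other may still overlap later ranges.
        if (a[i].end < b[j].end)
            ++i;
        else
            ++j;
    }
    return result;
}

// Folds the intersection over every enabled track's ranges.
std::vector<TimeRange> intersect_all_tracks(std::vector<std::vector<TimeRange>> const& tracks)
{
    if (tracks.empty())
        return {};
    std::vector<TimeRange> result = tracks[0];
    for (size_t k = 1; k < tracks.size(); ++k)
        result = intersect_ranges(result, tracks[k]);
    return result;
}
```

With a video track buffered over [0, 1000) and [2000, 3000) and an audio track buffered over [500, 2500), the sketch yields [500, 1000) and [2000, 2500): only the spans where every track has data.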
Zaggy1024
5083a96c90 LibMedia: Add a declaration for the Matroska Chapters element
We'll need to check for this element in the MSE WebM byte stream parser
in order to skip it.
2026-04-01 02:54:22 -05:00
Zaggy1024
09f3b8015d LibMedia: Specify WebM track kinds based on the track sourcing spec
This should bring us completely in line with the spec. We specify all
non-WebM text tracks as kind "subtitles", per the spec note.
2026-04-01 02:54:22 -05:00
Zaggy1024
498e571772 LibMedia: Move Matroska::track_from_track_entry to Document.h
This will need to be used by the MSE WebM byte stream parser.
2026-04-01 02:54:22 -05:00
Zaggy1024
b4db8f11c5 LibMedia+LibWeb: Align Media::Track more to the web spec
...giving tracks a kind attribute, and renaming name to label.

Demuxers will need to determine the kind attribute, since the spec for
sourcing tracks requires us to select based on info we don't expose.
2026-04-01 02:54:22 -05:00
Zaggy1024
d6f821f22d LibMedia: Parse Opus frame durations in Matroska::Reader
Most WebM files don't have their default duration defined, so we need
to parse the Opus frame header to determine the duration. This is
needed for buffered range calculation.
2026-04-01 02:54:22 -05:00
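
For d6f821f22d above, the duration of an Opus packet can be derived from its TOC byte as laid out in RFC 6716. The sketch below is an assumption about the general approach, not the parser added by the commit; `opus_packet_duration_us` is a hypothetical name, and it returns microseconds so that 2.5 ms frames stay integral.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>

// Returns the total duration of one Opus packet in microseconds, or nothing
// if the packet is too short or describes an invalid duration.
std::optional<uint32_t> opus_packet_duration_us(uint8_t const* packet, size_t size)
{
    if (size == 0)
        return std::nullopt;

    uint8_t toc = packet[0];
    uint8_t config = toc >> 3;             // Top 5 bits: mode, bandwidth and frame size.
    uint8_t frame_count_code = toc & 0x03; // Bottom 2 bits: how frames are packed.

    uint32_t frame_duration_us = 0;
    if (config < 12) {
        // SILK-only configurations: 10, 20, 40 or 60 ms.
        static constexpr uint32_t silk_durations[] = { 10000, 20000, 40000, 60000 };
        frame_duration_us = silk_durations[config & 0x03];
    } else if (config < 16) {
        // Hybrid configurations: 10 or 20 ms.
        frame_duration_us = (config & 0x01) ? 20000 : 10000;
    } else {
        // CELT-only configurations: 2.5, 5, 10 or 20 ms.
        static constexpr uint32_t celt_durations[] = { 2500, 5000, 10000, 20000 };
        frame_duration_us = celt_durations[config & 0x03];
    }

    uint32_t frame_count = 0;
    switch (frame_count_code) {
    case 0:
        frame_count = 1;
        break;
    case 1:
    case 2:
        frame_count = 2;
        break;
    default:
        // Code 3: the frame count is in the low 6 bits of the second byte.
        if (size < 2)
            return std::nullopt;
        frame_count = packet[1] & 0x3F;
        break;
    }

    // A packet must contain at least one frame and at most 120 ms of audio.
    uint32_t total = frame_duration_us * frame_count;
    if (frame_count == 0 || total > 120000)
        return std::nullopt;
    return total;
}
```

A TOC byte of `0xFC` (CELT fullband, one 20 ms frame) gives 20000 µs, for example.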
Zaggy1024
6b45a11716 LibMedia: Calculate Matroska block timestamps for their actual tracks
Instead of using a single track entry for all blocks in the file, use a
lookup to get the info needed to calculate the timestamp for the
specific track a block belongs to. No change in behavior for
SampleIterator, since that only returns blocks from the track that was
passed. This will be useful for MSE, since it demuxes all tracks at
once.
2026-04-01 02:54:22 -05:00
Zaggy1024
ca811192ab LibMedia: Make Matroska::Reader::duration() const 2026-04-01 02:54:22 -05:00
Zaggy1024
b23abd4b11 LibMedia: Simplify reading of Matroska VINTs and handle unknown sizes
Unknown sizes are referenced in the MSE WebM spec, so we need to expose
them in the reader functions.
2026-04-01 02:54:22 -05:00
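
As background for b23abd4b11 above: an EBML element size is a variable-length integer (VINT) whose length is given by the number of leading zero bits in its first byte, and a size whose value bits are all ones means "unknown size". A minimal sketch of that rule follows. `VintSize` and `read_vint_size` are hypothetical names and do not reflect the reader functions in the commit.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical result type: the decoded size, whether it is the reserved
// "unknown size" value, and how many bytes the VINT occupied.
struct VintSize {
    uint64_t value;
    bool is_unknown;
    size_t length;
};

std::optional<VintSize> read_vint_size(uint8_t const* data, size_t size)
{
    // A first byte of zero would imply a VINT longer than 8 bytes.
    if (size == 0 || data[0] == 0)
        return std::nullopt;

    // Count leading zero bits to find the length; the first set bit is the marker.
    size_t length = 1;
    uint8_t marker = 0x80;
    while ((data[0] & marker) == 0) {
        marker >>= 1;
        ++length;
    }
    if (length > size)
        return std::nullopt;

    // Strip the marker bit, then append the remaining bytes big-endian.
    uint64_t value = data[0] & (marker - 1);
    for (size_t i = 1; i < length; ++i)
        value = (value << 8) | data[i];

    // Each byte contributes 7 value bits overall; all ones means unknown size.
    uint64_t all_ones = (uint64_t { 1 } << (7 * length)) - 1;
    return VintSize { value, value == all_ones, length };
}
```

For instance, the single byte `0xFF` decodes as an unknown size, while `0x40 0x02` decodes as a two-byte VINT with the value 2.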
Zaggy1024
25754993ce LibMedia: Add a method to read a Matroska signed integer and use it
The TrackOffset element doesn't use a signed VINT, but rather a VINT
size followed by an integer of that size.
2026-04-01 02:54:22 -05:00
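
To illustrate the distinction drawn in 25754993ce above: once the element's size has been read as a VINT, the payload is a plain big-endian two's-complement integer of that many bytes. The sketch below shows one way to sign-extend such a payload. `read_signed_integer` is a hypothetical name rather than the method added by the commit, and the 8-byte limit and zero-length rule follow the EBML specification.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>

// Decodes a big-endian two's-complement integer of `length` bytes.
std::optional<int64_t> read_signed_integer(uint8_t const* data, size_t length)
{
    // EBML integers may not exceed 8 bytes.
    if (length > 8)
        return std::nullopt;
    // A zero-length integer element represents the value 0.
    if (length == 0)
        return 0;

    // Pre-fill with ones when the sign bit is set, so the shifts sign-extend.
    uint64_t value = (data[0] & 0x80) ? ~uint64_t { 0 } : 0;
    for (size_t i = 0; i < length; ++i)
        value = (value << 8) | data[i];

    // Reinterpret the bit pattern as signed (two's complement assumed).
    return static_cast<int64_t>(value);
}
```

Under this sketch the two bytes `0xFE 0x0C` decode to -500, whereas `0x00 0x80` decodes to 128.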
Zaggy1024
c867a5b4cd LibMedia: Ensure Matroska integer elements aren't too large
The EBML spec mandates that these don't exceed 8 bytes.

Also, add a check for a length of 0 for floats, since this is allowed
by the spec.
2026-04-01 02:54:22 -05:00
Zaggy1024
227d78b16c LibMedia: Expose some Matroska element parsing functions for MSE
We'll want to reuse these for a byte stream parser, which has some more
restrictions over what we do in Matroska::Reader.
2026-04-01 02:54:22 -05:00
Zaggy1024
cd5daa5135 LibMedia: Move Matroska element IDs to their own file
These will be used to sniff for initialization segments in MSE.
2026-04-01 02:54:22 -05:00
Andreas Kling
eb789e790e Everywhere: Use AK::SaturatingMath and remove Checked saturating APIs
Port all callers of Checked<T>::saturating_add/sub/mul to the new
standalone functions in AK/SaturatingMath.h, and remove the old
APIs from Checked.
2026-03-21 18:20:09 -05:00
Zaggy1024
0c6de059d9 LibMedia: Clear Matroska::SampleIterator's last timestamp at EOS
If we don't clear this, then a seek to a timestamp after the
presentation timestamp of the last frame will not result in us decoding
and displaying that last frame again.
2026-02-06 13:28:09 +01:00
Zaggy1024
9fafd32bb7 LibMedia: Silence the EOS error when seeking in Matroska::Reader
If we encountered EOS while seeking forward, instead of returning the
last keyframe, we would return an error. This prevents us from decoding
the last frame if we're seeking to a timestamp after its presentation
timestamp.
2026-02-06 13:28:09 +01:00
Zaggy1024
a05da1b7d0 LibMedia: Make Matroska::SampleIterator movable again
It seems that at some point it became non-movable, causing some
warnings where we were trying to avoid unnecessary copies.
2026-02-06 13:28:09 +01:00
Zaggy1024
972438c4d7 LibMedia: Abstract the interface of IncrementallyPopulatedStream
The way that other classes interact with IncrementallyPopulatedStream
is now through a virtual interface MediaStream and MediaStreamCursor.
This way, we can have simpler implementations of reading media data
that will not require an RB tree and synchronization.
2026-01-30 10:02:00 -06:00
Zaggy1024
75231e63b1 LibMedia: Only pass Demuxer to the data providers
...and abstract away the stream/cursor blocking/aborting functionality
so that demuxers can implement or ignore those methods as they see fit.

This is a step towards implementing a wrapper demuxer for MSE streams.
2026-01-30 10:02:00 -06:00
Zaggy1024
ba8676fbd5 LibMedia: Mark MatroskaDemuxer override methods virtual 2026-01-30 10:02:00 -06:00
Zaggy1024
a5153d05e7 LibMedia: Avoid unnecessary stream RefPtr copies in Matroska::Reader 2026-01-29 05:22:27 -06:00
Zaggy1024
ee95de40d6 LibMedia: Validate fixed-size Matroska frames
We were allowing Matroska blocks with fixed-size lacing to contain
frames with non-divisible sizes. This should not be possible, as it
inherently means that trailing bytes will be discarded.

We now have a valid and invalid testcase for fixed-size lacing to
ensure our handling remains correct.
2026-01-28 16:12:22 -06:00
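
The check described in ee95de40d6 above amounts to a divisibility test: with fixed-size lacing, every frame in the block has the same size, so the laced payload must divide evenly by the frame count. A minimal sketch, using the hypothetical name `fixed_lacing_frame_size` rather than anything from the commit:

```cpp
#include <cstddef>
#include <optional>

// Returns the size of each frame in a fixed-size laced block, or nothing if
// the payload cannot be split into `frame_count` equal frames.
std::optional<size_t> fixed_lacing_frame_size(size_t payload_size, size_t frame_count)
{
    if (frame_count == 0)
        return std::nullopt;
    // A remainder would mean trailing bytes that belong to no frame.
    if (payload_size % frame_count != 0)
        return std::nullopt;
    return payload_size / frame_count;
}
```

A 300-byte payload with 3 frames gives 100 bytes per frame, while a 301-byte payload with 3 frames is rejected.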
Zaggy1024
f0d7d1d5f5 LibMedia: Track Matroska master element ends with position()
We don't actually need a Vector stack of bytes read for each element
we're reading out of a Matroska file. We already have the C++ stack,
in which we can store the start and end of the master elements we're
reading.

This fixes an issue where seeks while parsing master elements would not
increment m_octets_read, so the master element could continue reading
further than intended.

This could cause a BlockGroup followed by a SimpleBlock to read as if
the BlockGroup contained the SimpleBlock, meaning that SampleIterator
would skip the SimpleBlock.

A test is added to ensure this doesn't regress again.
2026-01-28 14:48:03 -06:00
Zaggy1024
35fc061795 LibMedia: Return the original sample iterator in Matroska slow seeks
This was incorrectly advancing the passed-in iterator, which could
leave it at the position of a non-keyframe and cause decoding errors.
Instead, store the original iterator non-optionally, and return that
when it is the best position for the seek target.
2026-01-21 11:35:33 +01:00
Zaggy1024
9a421ffe9f LibMedia: Make demuxers thread-safe and remove MutexedDemuxer 2026-01-07 00:13:32 +01:00
Aliaksandr Kalenik
4261d73781 LibMedia: Remove m_stream_cursor from Matroska::Reader
Pass the Streamer through parsing methods during initialization instead
of storing the cursor as a member. This makes Reader effectively
read-only after construction, improving thread safety since the cursor
is no longer shared state.
2026-01-07 00:13:32 +01:00
Zaggy1024
b77980b4cc LibMedia: Move Matroska's get_frames method to SampleIterator
This ensures that we're using the reader for the particular thread that
the block was read from, avoiding any race conditions between seeks and
reads across threads.
2026-01-07 00:13:32 +01:00
Zaggy1024
5da08f59b0 LibMedia: Remove a function declaration from Matroska::SampleIterator
This was unused.
2026-01-07 00:13:32 +01:00
Zaggy1024
0170111bb5 LibMedia: Stop lazily parsing data in Matroska::Reader
This wasn't really gaining us anything in practice, and it makes it
harder to make sample iterators thread-safe.
2026-01-07 00:13:32 +01:00
Zaggy1024
40c170dac3 LibMedia: Handle nested Matroska Segment elements
This is malformed input, but it's useful for testing MSE buffers and
doesn't cause any additional issues.

It won't actually be used for MSE playback, as that will be able to
detect a format change that precedes the new initialization data
containing the new EBML and Segment elements.
2026-01-07 00:13:32 +01:00
Zaggy1024
8360f2e6f8 LibMedia: Don't error if the Matroska Cues element is not found
We can handle seeking without cues, so there's no reason to fail here.
2026-01-07 00:13:32 +01:00
Zaggy1024
d9f9797a7c LibMedia: Locate top-level Matroska elements from their element IDs
The SeekPosition element specifies the positions relative to the
element ID field in the SeekHead element, not the size. The previous
method of looking them up just happened to work because all top-level
elements have 4-byte IDs.

Correcting this allows us to add a check to make sure the element that
the SeekHead points to is actually the element we're looking for.

We could potentially separate the parsed SeekHead positions from the
ones that are determined by scanning the file, so that we can fall back
to those if the SeekHead is corrupted, but for now, this is good
enough.
2026-01-07 00:13:32 +01:00
Zaggy1024
5aa5beed26 LibMedia: Avoid copying Matroska block data when seeking
We only need to get the frames from a block when requested by the
demuxer, so factor that out into a function that it can call when it is
outputting frames.
2026-01-05 17:53:24 -06:00
Zaggy1024
80c3f87a4a LibMedia: Calculate Matroska block data size in a prettier way 2026-01-05 17:53:24 -06:00
Zaggy1024
6d31776151 LibMedia: Return keyframes after the timestamp in slow Matroska seeks
The break upon passing the target timestamp was in the wrong places,
which resulted in us returning a keyframe that was after the timestamp
rather than before it.
2026-01-05 17:53:24 -06:00
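
The intended behavior described in 6d31776151 above (return the keyframe before the target rather than after it) can be sketched as a linear scan that breaks before recording any block past the target. This is an illustration only: `Block` and `find_seek_keyframe` are hypothetical names, and the blocks are assumed to be sorted by timestamp.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for a demuxed block.
struct Block {
    int64_t timestamp;
    bool is_keyframe;
};

// Returns the index of the last keyframe at or before `target`, if any.
std::optional<size_t> find_seek_keyframe(std::vector<Block> const& blocks, int64_t target)
{
    std::optional<size_t> best;
    for (size_t i = 0; i < blocks.size(); ++i) {
        // Stop before recording anything that lies past the target.
        if (blocks[i].timestamp > target)
            break;
        if (blocks[i].is_keyframe)
            best = i;
    }
    return best;
}
```

Placing the `break` after the keyframe check instead would let a keyframe past the target overwrite `best`, which is the kind of misplacement the commit message describes.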
Zaggy1024
72a5581f7b LibMedia: Seek Matroska cue-less tracks using the first track's cues
Defaulting to cues from the first track allows us to skip to any point
in a file without having to read and skip all the clusters/blocks up to
that point. This is a prerequisite to using range requests to seek in
large video files.

In order to ensure that this remains correct in cases where the first
track's cues point to a cluster containing blocks in the seeked track
with later timestamps than the seek target, the original logic is used
to start from the first block and iteratively find the closest keyframe
to the target timestamp.

In cases where the timestamp falls after the selected cue point, we
also use the iterative seeking to skip to the last keyframe that
precedes the target timestamp, which will often allow us to skip
decoding many audio blocks, since most audio codecs only store
independently-decodable blocks.
2026-01-05 17:53:24 -06:00
Zaggy1024
ea25881185 LibMedia: Reduce the size of cue points in Matroska::Reader
In the map from track to vector of cue points, we were storing
positions for each track that was included in the cues. Instead, only
store the position of the track to which the vector is assigned in the
map.
2026-01-05 17:53:24 -06:00
Zaggy1024
2e638cb5c1 LibMedia: Iterate Matroska cue track positions with structured bindings 2026-01-05 17:53:24 -06:00
Zaggy1024
a88b3e489d LibMedia: Use HashMap::ensure() to add Matroska track cue points 2026-01-05 17:53:24 -06:00
Gingeh
451177f1f4 LibMedia: Propagate errors from create_context_for_track 2026-01-02 16:19:44 +01:00
Zaggy1024
1e773942b1 LibMedia: Try to use FFmpeg for all formats other than Matroska
By sniffing specifically for MP4 and WebM, we were precluding
PlaybackManager from playing any other formats. Instead, use
MatroskaDemuxer if the media has a `matroska` or `webm` EBML doctype,
and fall back to FFmpeg for all others.

We'll need to limit the containers that FFmpeg is able to open at some
point, but for now, this allows us to play the formats we could before.
2025-12-18 17:28:14 -06:00
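
The selection rule from 1e773942b1 above can be summarized in a few lines. The sketch assumes the EBML doctype has already been sniffed (and is absent when the data has no valid EBML header). `DemuxerKind` and `choose_demuxer` are hypothetical names, not the PlaybackManager API.

```cpp
#include <optional>
#include <string_view>

// Hypothetical tag for which demuxer implementation to construct.
enum class DemuxerKind {
    Matroska,
    FFmpeg,
};

// Picks the Matroska demuxer for `matroska` and `webm` doctypes, and falls
// back to FFmpeg for everything else, including non-EBML data.
DemuxerKind choose_demuxer(std::optional<std::string_view> ebml_doctype)
{
    if (ebml_doctype.has_value() && (*ebml_doctype == "matroska" || *ebml_doctype == "webm"))
        return DemuxerKind::Matroska;
    return DemuxerKind::FFmpeg;
}
```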
Zaggy1024
69ac94fcd3 LibMedia: Use Matroska parse_ebml_header for sniff_webm
Allow the function to stop reading without skipping to the end of the
data header element in order to avoid reading more data over the
network than necessary.
2025-12-18 17:28:14 -06:00
Zaggy1024
57df201ff2 LibMedia: Skip the entire EBML header after reading required elements 2025-12-18 17:28:14 -06:00
Zaggy1024
1cb7dea2fd LibMedia: Default-initialize the Matroska DocTypeVersion to 0 2025-12-18 17:28:14 -06:00
Zaggy1024
7e78ffdb86 LibMedia: Allow Matroska master element parsing to skip to the end
In a lot of cases, we're only parsing a few specific elements, then we
only need to skip unknown elements. To make this more efficient, we can
now use the ElementIterationDecision enum to specify that we should
break and jump to the end of the master element, instead of stopping in
place.
2025-12-18 17:28:14 -06:00
Aliaksandr Kalenik
f2498f1b9e LibMedia: Add MP4 and WebM media type sniffing
When media data is fully buffered, we can just try Matroska first and
fall back to FFmpeg. With incremental fetching, that approach becomes
wasteful: we may repeatedly attempt demuxer construction before enough
bytes are available, and FFmpeg in particular tends to produce noisy
logs while probing partial input.

Add lightweight container sniffing for WebM and MP4 that operates on
`IncrementallyPopulatedStream::Cursor`.
`prepare_playback_from_media_data()` now blocks until there is enough
data to decide the container type, then constructs the appropriate
demuxer directly instead of probing both.

Co-authored-by: Zaggy1024 <Zaggy1024@gmail.com>
2025-12-16 02:42:48 -06:00
Aliaksandr Kalenik
c5d8cb5c47 LibMedia: Change demuxers to use IncrementallyPopulatedStream as input
Refactor the FFmpeg and Matroska demuxers to consume data through
`IncrementallyPopulatedStream::Cursor` instead of a pointer to fully
buffered data.

This change establishes a new rule: each track must be initialized with
its own cursor. Data providers now explicitly create a per-track context
via `Demuxer::create_context_for_track(track, cursor)`, and own a
pointer to that cursor. In the upcoming changes, holding the cursor in
the provider would allow us to signal "cancel blocking reads" so an
in-flight seek can fail immediately when a newer seek request arrives.
2025-12-16 02:42:48 -06:00
Aliaksandr Kalenik
b9db157cea LibMedia: Delete special handling for seek to 0 in MatroskaDemuxer
In the upcoming changes we will explicitly have to ask the demuxer for
track context creation from the provided stream consumer, so it won't be
possible to drop the track context as an optimization when seeking to 0.
2025-12-16 02:42:48 -06:00
Zaggy1024
42b19bdebf LibMedia: Store a duration in CodedFrame 2025-12-10 16:02:40 -06:00