This fixes a rare flake in HTMLMediaElement-load-after-decode-error,
where if the demuxer managed to get a block successfully but had its
reads in get_frames() aborted by the initial seek, it would skip the
single bad frame and never fire the error event.
PlaybackManager then intersects all enabled tracks' buffered time
ranges. This will be used by the media element for the buffered
attribute and to update the ready state.
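
As a rough illustration of the intersection step (a minimal sketch, not
the PlaybackManager code; `TimeRange`, `intersect` and `intersect_all`
are hypothetical names, and each track's ranges are assumed to be sorted
and non-overlapping):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a buffered time range in microseconds: [start, end).
struct TimeRange {
    int64_t start;
    int64_t end;
};

// Intersects two lists of sorted, non-overlapping ranges.
std::vector<TimeRange> intersect(std::vector<TimeRange> const& a, std::vector<TimeRange> const& b)
{
    std::vector<TimeRange> result;
    size_t i = 0;
    size_t j = 0;
    while (i < a.size() && j < b.size()) {
        assert(a[i].start <= a[i].end && b[j].start <= b[j].end);
        auto start = std::max(a[i].start, b[j].start);
        auto end = std::min(a[i].end, b[j].end);
        if (start < end)
            result.push_back({ start, end });
        // Advance whichever range ends first; the other may still overlap
        // with later ranges in the opposite list.
        if (a[i].end < b[j].end)
            ++i;
        else
            ++j;
    }
    return result;
}

// Folds the intersection across every enabled track's buffered ranges.
std::vector<TimeRange> intersect_all(std::vector<std::vector<TimeRange>> const& tracks)
{
    if (tracks.empty())
        return {};
    auto result = tracks.front();
    for (size_t i = 1; i < tracks.size(); ++i)
        result = intersect(result, tracks[i]);
    return result;
}
```

A time only counts as buffered if every enabled track has data for it,
which is why the result can never be larger than the smallest input.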
...giving tracks a kind attribute, and renaming name to label.
Demuxers will need to determine the kind attribute, since the spec for
sourcing tracks requires us to select based on info we don't expose.
Most WebM files don't have their default duration defined, so we need
to parse the Opus frame header to determine the duration. This is
needed for buffered range calculation.
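
For reference, a packet's duration can be derived from its first byte
(the TOC byte described in RFC 6716, section 3.1). The following is a
minimal sketch of that calculation, not the code from this change;
`opus_packet_duration_us` is a hypothetical name, and it assumes each
block carries exactly one Opus packet:

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>

// Returns the duration of one Opus packet in microseconds, or nullopt if the
// packet is too short or declares zero frames.
std::optional<int64_t> opus_packet_duration_us(uint8_t const* data, size_t size)
{
    if (data == nullptr || size < 1)
        return std::nullopt;

    uint8_t toc = data[0];
    uint8_t config = toc >> 3;            // Top five bits: mode, bandwidth and frame size.
    uint8_t frame_count_code = toc & 0x3; // Bottom two bits: how frames are packed.

    int64_t frame_duration_us = 0;
    if (config < 12) {
        // SILK-only: 10, 20, 40 or 60 ms.
        static constexpr int64_t silk_durations[] = { 10000, 20000, 40000, 60000 };
        frame_duration_us = silk_durations[config & 0x3];
    } else if (config < 16) {
        // Hybrid: 10 or 20 ms.
        frame_duration_us = (config & 0x1) ? 20000 : 10000;
    } else {
        // CELT-only: 2.5, 5, 10 or 20 ms.
        static constexpr int64_t celt_durations[] = { 2500, 5000, 10000, 20000 };
        frame_duration_us = celt_durations[config & 0x3];
    }

    int64_t frame_count = 0;
    switch (frame_count_code) {
    case 0:
        frame_count = 1;
        break;
    case 1:
    case 2:
        frame_count = 2;
        break;
    default:
        // Code 3: the frame count is in the low six bits of the next byte.
        if (size < 2)
            return std::nullopt;
        frame_count = data[1] & 0x3F;
        if (frame_count == 0)
            return std::nullopt;
        break;
    }

    return frame_duration_us * frame_count;
}
```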
Instead of using a single track entry for all blocks in the file, use a
lookup to get the info needed to calculate the timestamp for the
specific track a block belongs to. No change in behavior for
SampleIterator, since that only returns blocks from the track that was
passed. This will be useful for MSE, since it demuxes all tracks at
once.
If we don't clear this, then a seek to a timestamp after the
presentation timestamp of the last frame will not result in us decoding
and displaying that last frame again.
If we encountered EOS while seeking forward, instead of returning the
last keyframe, we would return an error. This prevented us from
decoding the last frame if we were seeking to a timestamp after its
presentation timestamp.
Other classes now interact with IncrementallyPopulatedStream through
the virtual interfaces MediaStream and MediaStreamCursor.
This way, we can have simpler implementations of reading media data
that will not require an RB tree and synchronization.
...and abstract away the stream/cursor blocking/aborting functionality
so that demuxers can implement or ignore those methods as they see fit.
This is a step towards implementing a wrapper demuxer for MSE streams.
We were allowing Matroska blocks with fixed-size lacing to contain
frames with non-divisible sizes. This should not be possible, as it
inherently means that trailing bytes will be discarded.
We now have a valid and invalid testcase for fixed-size lacing to
ensure our handling remains correct.
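
The divisibility rule amounts to the following check (a minimal sketch;
`fixed_size_lacing_frame_size` is a hypothetical helper, not a function
from the Matroska reader):

```cpp
#include <cstddef>
#include <optional>

// Returns the size of each frame in a block that uses fixed-size lacing, or
// nullopt if the payload cannot be split into equally sized frames.
std::optional<size_t> fixed_size_lacing_frame_size(size_t payload_size, size_t frame_count)
{
    // A remainder would mean trailing bytes that belong to no frame.
    if (frame_count == 0 || payload_size % frame_count != 0)
        return std::nullopt;
    return payload_size / frame_count;
}
```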
We don't actually need a Vector stack of bytes read for each element
we're reading out of a Matroska file; we already have the C++ stack
in which we can store the start and end of the master elements we're
reading.
This fixes an issue where seeks while parsing master elements would not
increment m_octets_read, so the master element could continue reading
further than intended.
This could cause a BlockGroup followed by a SimpleBlock to read as if
the BlockGroup contained the SimpleBlock, meaning that SampleIterator
would skip the SimpleBlock.
A test is added to ensure this doesn't regress again.
This was incorrectly advancing the passed-in iterator, which could
leave it at the position of a non-keyframe and cause decoding errors.
Instead, store the original iterator non-optionally, and return that
when it is the best position for the seek target.
Pass the Streamer through parsing methods during initialization instead
of storing the cursor as a member. This makes Reader effectively
read-only after construction, improving thread safety since the cursor
is no longer shared state.
This ensures that we're using the reader for the particular thread that
the block was read from, avoiding any race conditions between seeks and
reads across threads.
This is malformed input, but it's useful for testing MSE buffers and
doesn't cause any additional issues.
It won't actually be used for MSE playback, as that will be able to
detect a format change that precedes the new initialization data
containing the new EBML and Segment elements.
The SeekPosition element specifies the positions relative to the
element ID field in the SeekHead element, not the size. The previous
method of looking them up just happened to work because all top-level
elements have 4-byte IDs.
Correcting this allows us to add a check to make sure the element that
the SeekHead points to is actually the element we're looking for.
We could potentially separate the parsed SeekHead positions from the
ones that are determined by scanning the file, so that we can fall back
to those if the SeekHead is corrupted, but for now, this is good
enough.
We only need to get the frames from a block when requested by the
demuxer, so factor that out into a function that it can call when it is
outputting frames.
The break upon passing the target timestamp was in the wrong place,
which resulted in us returning a keyframe that was after the timestamp
rather than before it.
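
The intended ordering can be sketched like this (a simplified model, not
the demuxer's seek code; `Block` and `find_keyframe_before` are
hypothetical, and blocks are assumed to be sorted by timestamp):

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical, simplified view of a block for seeking purposes.
struct Block {
    int64_t timestamp;
    bool is_keyframe;
};

// Returns the index of the last keyframe at or before `target`, if any.
std::optional<size_t> find_keyframe_before(std::vector<Block> const& blocks, int64_t target)
{
    std::optional<size_t> best;
    for (size_t i = 0; i < blocks.size(); ++i) {
        // Check for passing the target before recording the block, so that a
        // keyframe after the target is never selected.
        if (blocks[i].timestamp > target)
            break;
        if (blocks[i].is_keyframe)
            best = i;
    }
    return best;
}
```

If the target lies past the final block, the loop simply runs out of
blocks and the last keyframe seen is returned.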
Defaulting to cues from the first track allows us to skip to any point
in a file without having to read and skip all the clusters/blocks up to
that point. This is a prerequisite to using range requests to seek in
large video files.
In order to ensure that this remains correct in cases where the first
track's cues point to a cluster containing blocks in the seeked track
with later timestamps than the seek target, the original logic is used
to start from the first block and iteratively find the closest keyframe
to the target timestamp.
In cases where the timestamp falls after the selected cue point, we
also use the iterative seeking to skip to the last keyframe that
precedes the target timestamp, which will often allow us to skip
decoding many audio blocks, since most audio codecs only store
independently-decodable blocks.
In the map from track to vector of cue points, we were storing
positions for each track that was included in the cues. Instead, only
store the position of the track to which the vector is assigned in the
map.
By sniffing specifically for MP4 and WebM, we were precluding
PlaybackManager from playing any other formats. Instead, use
MatroskaDemuxer if the media has a `matroska` or `webm` EBML doctype,
and fall back to FFmpeg for all others.
We'll need to limit the containers that FFmpeg is able to open at some
point, but for now, this allows us to play the formats we could before.
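
The selection rule reduces to something like the following (a minimal
sketch; `DemuxerKind` and `choose_demuxer` are hypothetical names, and
the doctype is assumed to have been read from the EBML header already,
or to be empty when the data is not EBML):

```cpp
#include <optional>
#include <string_view>

// Hypothetical tag for which demuxer to construct.
enum class DemuxerKind {
    Matroska,
    FFmpeg,
};

// Picks the Matroska demuxer for the doctypes it understands and falls back to
// FFmpeg for everything else, including data with no EBML header at all.
DemuxerKind choose_demuxer(std::optional<std::string_view> ebml_doctype)
{
    if (ebml_doctype.has_value() && (*ebml_doctype == "matroska" || *ebml_doctype == "webm"))
        return DemuxerKind::Matroska;
    return DemuxerKind::FFmpeg;
}
```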
Allow the function to stop reading without skipping to the end of the
data header element in order to avoid reading more data over the
network than necessary.
In a lot of cases, we're only parsing a few specific elements, then we
only need to skip unknown elements. To make this more efficient, we can
now use the ElementIterationDecision enum to specify that we should
break and jump to the end of the master element, instead of stopping in
place.
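
To illustrate the difference between the two ways of stopping, here is
a toy model (not the parser's code). Only the enum's name comes from
this change; its enumerators, `ChildElement` and `iterate_children` are
assumptions made up for the example:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical enumerators; the real enum may use different names.
enum class ElementIterationDecision {
    Continue,
    BreakHere,
    BreakAtEndOfParent,
};

// Hypothetical child element: its ID and its total encoded size in bytes.
struct ChildElement {
    uint32_t id;
    size_t size;
};

// Visits the children of a master element whose data starts at `start`, and
// returns the stream position the caller is left at afterwards.
template<typename Callback>
size_t iterate_children(std::vector<ChildElement> const& children, size_t start, Callback callback)
{
    // A real parser would know this from the master element's size field.
    size_t end_of_parent = start;
    for (auto const& child : children)
        end_of_parent += child.size;

    size_t position = start;
    for (auto const& child : children) {
        position += child.size;
        switch (callback(child)) {
        case ElementIterationDecision::Continue:
            break;
        case ElementIterationDecision::BreakHere:
            // Stop in place, just past the child that was handled.
            return position;
        case ElementIterationDecision::BreakAtEndOfParent:
            // Skip every remaining child in a single jump.
            return end_of_parent;
        }
    }
    return position;
}
```

Stopping in place avoids touching data past the last element of
interest, while jumping to the end leaves the stream ready for the
parent's next sibling.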
When media data is fully buffered, we can just try Matroska first and
fall back to FFmpeg. With incremental fetching, that approach becomes
wasteful: we may repeatedly attempt demuxer construction before enough
bytes are available, and FFmpeg in particular tends to produce noisy
logs while probing partial input.
Add lightweight container sniffing for WebM and MP4 that operates on
`IncrementallyPopulatedStream::Cursor`.
`prepare_playback_from_media_data()` now blocks until there is enough
data to decide the container type, then constructs the appropriate
demuxer directly instead of probing both.
Co-authored-by: Zaggy1024 <Zaggy1024@gmail.com>
Refactor the FFmpeg and Matroska demuxers to consume data through
`IncrementallyPopulatedStream::Cursor` instead of a pointer to fully
buffered data.
This change establishes a new rule: each track must be initialized with
its own cursor. Data providers now explicitly create a per-track context
via `Demuxer::create_context_for_track(track, cursor)`, and own a
pointer to that cursor. In upcoming changes, holding the cursor in the
provider will allow it to signal "cancel blocking reads" so an
in-flight seek can fail immediately when a newer seek request arrives.
In upcoming changes, we will have to explicitly ask the demuxer for
track context creation from the provided stream consumer, so it won't
be possible to drop the track context as an optimization when seeking
to 0.