libavcodec apparently holds onto any error that is not AVERROR_EOF when
a read fails. This means that reading until EOF after an aborted read
results in us receiving an AVERROR_EXIT in FFmpegDemuxer instead of
AVERROR_EOF, which causes the playback system to enter an error state
without decoding all frames in the file.
Instead, just always return AVERROR_EOF, and have FFmpegDemuxer check
whether the read was aborted so it can return the correct error
category from there.
For web audio, I reckon an occasional misjudged channel layout is
better than more frequent exceptions.
Signed PCM is normalized by dividing by half the number of representable
values (32768 for 16-bit), not by the signed max. If you divide by the
signed max (32767), the most negative sample maps to a value just below
-1.0. It's not audible; this mostly matters for tests that assume
correct normalization. But it turns out there's no shortage of
self-proclaimed "golden ears" out there who swear they can hear the
difference.
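A minimal sketch of the normalization described above; `normalize_s16`
is a hypothetical helper name, not the actual function in the codebase:

```cpp
#include <cstdint>

// Normalize a signed 16-bit PCM sample to floating point.
// Dividing by 32768 (half the number of 16-bit values) maps the full
// input range [-32768, 32767] into [-1.0, 1.0); dividing by the signed
// max 32767 would map -32768 to roughly -1.00003, below the -1.0 floor.
constexpr float normalize_s16(int16_t sample)
{
    return static_cast<float>(sample) / 32768.0f;
}
```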
Other classes now interact with IncrementallyPopulatedStream through
the virtual interfaces MediaStream and MediaStreamCursor.
This way, we can have simpler implementations of reading media data
that will not require an RB tree and synchronization.
...and abstract away the stream/cursor blocking/aborting functionality
so that demuxers can implement or ignore those methods as they see fit.
This is a step towards implementing a wrapper demuxer for MSE streams.
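A sketch of what such an interface split might look like. Only the
class names MediaStream and MediaStreamCursor come from the commit;
every method name and the buffered implementation below are assumptions
for illustration:

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical cursor interface: reads bytes, can seek, and exposes a
// blocking/abort hook that simple (fully buffered) streams may ignore.
class MediaStreamCursor {
public:
    virtual ~MediaStreamCursor() = default;
    // Read up to `size` bytes into `buffer`, returning how many were copied.
    virtual size_t read(uint8_t* buffer, size_t size) = 0;
    virtual void seek_to(size_t offset) = 0;
    // Default no-op, so demuxers over complete data can ignore aborting.
    virtual void abort_pending_reads() { }
};

// Hypothetical stream interface that hands out cursors.
class MediaStream {
public:
    virtual ~MediaStream() = default;
    virtual std::unique_ptr<MediaStreamCursor> create_cursor() = 0;
};

// A trivial implementation over fully buffered data: no RB tree and no
// synchronization needed, since all bytes are available up front.
class BufferedStreamCursor final : public MediaStreamCursor {
public:
    explicit BufferedStreamCursor(std::vector<uint8_t> data)
        : m_data(std::move(data))
    {
    }

    size_t read(uint8_t* buffer, size_t size) override
    {
        size_t available = m_data.size() - m_offset;
        size_t count = size < available ? size : available;
        for (size_t i = 0; i < count; ++i)
            buffer[i] = m_data[m_offset + i];
        m_offset += count;
        return count;
    }

    void seek_to(size_t offset) override { m_offset = offset; }

private:
    std::vector<uint8_t> m_data;
    size_t m_offset { 0 };
};
```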
It's not necessary to keep around an instance of AVFormatContext in
FFmpegDemuxer, so instead just copy out the info we need for our
implementation and then destroy it so that our stream cursor is freed.
This saves us from having our own color conversion code, which was
taking up a fair amount of time in VideoDataProvider. With this change,
we should be able to play high resolution videos without interruptions
on machines where the CPU can keep up with decoding.
In order to make this change, ImmutableBitmap can now be constructed
with YUV data instead of an RGB bitmap. It holds onto a YUVData
instance that stores the buffers of image data, since Skia itself
doesn't take ownership of them.
In order to support greater than 8 bits of color depth, we normalize
the 10- or 12-bit color values into a 16-bit range.
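One common way to do this normalization is bit replication; the commit
doesn't say which method is used, so treat this as an illustrative
sketch rather than the actual code:

```cpp
#include <cstdint>

// Scale a 10- or 12-bit color component into the full 16-bit range by
// bit replication: shift the value up, then fill the vacated low bits
// with the value's own high bits. This maps 0 -> 0 and the maximum
// input value (1023 or 4095) -> 65535 exactly.
constexpr uint16_t scale_to_16_bit(uint16_t value, unsigned bit_depth)
{
    unsigned shift = 16 - bit_depth;
    return static_cast<uint16_t>((value << shift) | (value >> (bit_depth - shift)));
}
```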
When a seek is requested while a previous seek is still blocked waiting
for not-yet-available bytes, we want to abandon the old request
immediately and start processing the new one.
Refactor the FFmpeg and Matroska demuxers to consume data through
`IncrementallyPopulatedStream::Cursor` instead of a pointer to fully
buffered data.
This change establishes a new rule: each track must be initialized with
its own cursor. Data providers now explicitly create a per-track context
via `Demuxer::create_context_for_track(track, cursor)`, and own a
pointer to that cursor. In upcoming changes, holding the cursor in the
provider will allow it to signal "cancel blocking reads" so an
in-flight seek can fail immediately when a newer seek request arrives.
Audio blocks now contain a sample specification with the sample rate
and channel map for the audio data they contain. This will facilitate
conversion from one sample specification to another in order to allow
playback on devices with more or fewer speakers than the audio data
contains.
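A rough illustration of the idea. The field names in `SampleSpec` and
the downmix helper are assumptions, not the actual API; stereo-to-mono
averaging is just the simplest case of converting between channel
layouts:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sample specification carried alongside each audio block.
struct SampleSpec {
    unsigned sample_rate;   // e.g. 48000
    unsigned channel_count; // e.g. 2 for stereo
};

// Convert interleaved stereo samples to mono by averaging each L/R
// pair, so the block can play on a single-speaker device.
std::vector<float> downmix_stereo_to_mono(std::vector<float> const& stereo)
{
    std::vector<float> mono(stereo.size() / 2);
    for (size_t i = 0; i < mono.size(); ++i)
        mono[i] = (stereo[2 * i] + stereo[2 * i + 1]) / 2.0f;
    return mono;
}
```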
By default, MatroskaDemuxer chooses not to seek if the current frame
is closer to the seek target than the keyframe that precedes the seek
target. However, it can be desirable to seek to a keyframe anyway, so
let's allow that.
This commit implements the functionality to play back audio through
PlaybackManager.
To decode the audio data, an AudioDataProvider is created for each
track in the provided media data. These providers will fill their audio block
queue, then sit idle until their corresponding tracks are enabled.
In order to output the audio, one AudioMixingSink is created which
manages a PlaybackStream which requests audio blocks from multiple
AudioDataProviders and mixes them into one buffer with sample-perfect
precision.
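The mixing step itself reduces to summing sample-aligned buffers. This
is a minimal sketch of that idea, not AudioMixingSink's actual
implementation; it assumes every provider delivers a block of the same
length, which is what sample-perfect alignment requires:

```cpp
#include <cstddef>
#include <vector>

// Mix equal-length float sample buffers from multiple providers into
// one output buffer by summing them sample-by-sample.
std::vector<float> mix_blocks(std::vector<std::vector<float>> const& blocks)
{
    if (blocks.empty())
        return {};
    std::vector<float> out(blocks[0].size(), 0.0f);
    for (auto const& block : blocks)
        for (size_t i = 0; i < out.size(); ++i)
            out[i] += block[i];
    return out;
}
```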
Very minor change, which doesn't actually affect our output, since we
were already inputting and outputting microseconds, but it can't hurt
to give FFmpeg's decoder this information as well.
Most users will only care about the total file duration, and shouldn't
be required to determine the file duration from multiple track
durations. To facilitate that, add a total_duration() function that
returns the demuxer's duration not associated with any particular track.
The existing conversion was rounding to the nearest millisecond, which
is far less precision than most videos need. Instead, use only
integer math to directly convert the presentation time to seconds and
nanoseconds for our AK::Duration to represent accurately.
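The integer-only conversion can be sketched like this. The commit
doesn't show the actual code, so the function and struct names are
assumptions; for simplicity this ignores negative timestamps and
intermediate overflow for very large tick values:

```cpp
#include <cstdint>

// Stand-in for AK::Duration's seconds + nanoseconds representation.
struct Duration {
    int64_t seconds;
    int64_t nanoseconds; // 0 .. 999'999'999
};

// Convert a presentation timestamp in time-base ticks (worth
// ticks * numerator / denominator seconds) into whole seconds plus
// nanoseconds using only integer math, avoiding the precision loss of
// rounding through milliseconds.
constexpr Duration ticks_to_duration(int64_t ticks, int64_t numerator, int64_t denominator)
{
    int64_t total = ticks * numerator;
    int64_t seconds = total / denominator;
    int64_t remainder = total % denominator;
    int64_t nanoseconds = remainder * 1'000'000'000 / denominator;
    return { seconds, nanoseconds };
}
```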