[FFmpeg-devel] [PATCH] avcodec/aacdec: don't force HE-AACv2 profile if no PS info is present
Alex Converse
alex.converse at gmail.com
Sat Jul 23 02:00:53 EEST 2022
On Fri, Jul 22, 2022 at 8:37 AM James Almer <jamrial at gmail.com> wrote:
> On 7/22/2022 12:14 PM, Andreas Rheinhardt wrote:
> > James Almer:
> >> On 7/22/2022 11:56 AM, Andreas Rheinhardt wrote:
> >>> James Almer:
> >>>> On 7/22/2022 11:23 AM, Andreas Rheinhardt wrote:
> >>>>> James Almer:
> >>>>>> On 7/18/2022 10:57 AM, Andreas Rheinhardt wrote:
> >>>>>>> James Almer:
> >>>>>>>> On 7/14/2022 9:10 AM, Andreas Rheinhardt wrote:
> >>>>>>>>> James Almer:
> >>>>>>>>>> Should fix ticket #3361
> >>>>>>>>>>
> >>>>>>>>>> Signed-off-by: James Almer <jamrial at gmail.com>
> >>>>>>>>>> ---
> >>>>>>>>>> This also needs an update to some fate ref samples i'll upload
> >>>>>>>>>> before
> >>>>>>>>>> pushing
> >>>>>>>>>> (fate-aac-al_sbr_ps_04_ur and fate-aac-al_sbr_ps_06_ur which
> >>>>>>>>>> are now
> >>>>>>>>>> decoded
> >>>>>>>>>> properly as he_aac mono, so the .s16 files need to be replaced).
> >>>>>>>>>>
> >>>>>>>>>
[snip]
> >>>>>>
> >>>>>>
> >>>>>> it seems at least for these samples the fixed decoder does not
> >>>>>> generate
> >>>>>> a decoded stream comparable to the float one, so I'll just upload a
> >>>>>> new
> >>>>>> raw pcm file.
> >>>>>
> >>>>> When I decode both of these streams with git master, the left
> >>>>> channel is
> >>>>> pretty much identical, yet the right channel of the fixed-point decoder
> >>>>> is silent and the right channel of the floating point decoder is not.
> >>>>> With this patch applied, the result are two mono streams that are
> >>>>> pretty
> >>>>> much identical: The test sample created by the floating-point decoder
> >>>>> works with the fixed-point decoder test (if one uncomments and modifies
> >>>>> the latter). So the issue with aac-al_sbr_ps_06_ur is not a reason to
> >>>>> upload new samples.
> >>>>
> >>>> Ok, can you suggest how to add a test that decodes with the fixed point
> >>>> decoder then compares that with the output of the float decoder? Is
> >>>> there a helper in fate.sh already for this?
> >>>>
> >>>
> >>> There is currently no helper in fate-run.sh for this.
> >>
> >> Yeah, figures that's the case. Can you suggest how one would work for this?
> >>
> >
> > A new function in fate-run.sh seems to be necessary. Given that a
> > similar situation exists for AC-3 we should not hardcode aac; instead it
> > should have two parameters for the float and the fixed decoder. Then it
> > should decode the input file twice and do the same comparison that the
> > current tests use (they use the oneoff method, which results in
> > do_tiny_psnr with MAXDIFF being called).
> > I think the tests for the fixed-point decoder (with checksums) should be
> > separate, so that one can test this without the floating-point decoder
> > being present.
> >
> >>>
> >>>>>
> >>>>> - Andreas
> >>>>>
> >>>>> PS: libfdk-aac produces a file that looks pretty much like the floating
> >>>>> point decoder from git master. Are you sure your patch is correct?
> >>>>
> >>>> Yes, they duplicate the single channel in the stream and output it as
> >>>> stereo, something that should be done by a filter if that's what the
> >>>> user wants. Decoding a mono sample should generate a mono stream.
> >>>
> >>> Not really. The channels are different.
> >>
> >> ./ffmpeg -c:a libfdk_aac -i ../samples/aac/al_sbr_ps_04_new.mp4 -af
> >> channelmap=channel_layout=mono:map=0 -f md5 -
> >>
> >> has the same result as
> >>
> >> ./ffmpeg -c:a libfdk_aac -i ../samples/aac/al_sbr_ps_04_new.mp4 -af
> >> channelmap=channel_layout=mono:map=1 -f md5 -
> >>
> >> Same with the samples in the ticket.
> >>
> >
> > This seems to be true for al_sbr_ps_04_new.mp4; but it is not true for
> > al_sbr_ps_06_new.mp4.
>
> So looks like nearly a hundred samples into al_sbr_ps_06_new.mp4 frames
> start containing PS info. With this patch the decoder properly decodes
> the first hundred as mono, but since the decoder is locked, it will keep
> decoding the stereo samples as mono.
>
Hey all,
I thought I should share a little bit of context about this problem,
but I don't mean to come back from nowhere and try to overrule you
all. Do what you decide is best.
An HE-AACv2 decoder treating unsignaled mono as stereo is an
intentional design trade-off that the MPEG audio committee made. It is
a tradeoff that the FFmpeg decoder has mimicked for a number of years.
If you want to revisit the trade-off (and there may very well be good
reasons to do that) that's fine, but I think treating the current
behavior as a "bug" is the wrong approach.
In fact, those fate tests are based on a Coding Technologies test
suite designed to validate a decoder conformed to the MPEG behavior.
Parametric Stereo is nested inside the SBR extension after the main
SBR payload which itself is nested inside the AAC raw data block after
the main channel elements. It takes a full bitstream parse of both the
AAC and SBR layers and finding an SBR intra-frame to even see if PS is
present.
As for why I think MPEG made this trade-off (my opinion of why they did this) :
- It enables cutting (or joining a stream of) audio at arbitrary
frames without losing PS content or stereo detection.
- Most devices with mono output can support downmixing from stereo.
- Down mixing stereo to mono can be ugly with regard to phase but it
doesn't have nearly the complexity of the taste/environment/judgment
factor that surround sound to stereo downmix does.
- In transcode scenarios, most output codecs can support encoding
stereo formatted audio where both channels are identical quite
efficiently (even in AAC-LC with mid-side coding the overhead for
mono-coded as stereo is a single digit number of bits per frame).
- On playback on a stereo device it doesn't matter if the decoded
signal is one channel played out of both speakers or two channels that
are identical.
- This behavior can be overridden with more complicated signaling the
extradata (but this requires a transport that supports such signaling
and doesn't simply wrap ADTS).
- The folks working on iterating "MPEG-2 NBC" into the "MPEG-4 Audio"
monstrosity were primarily focused on getting their new features used.
More information about the ffmpeg-devel
mailing list