[FFmpeg-trac] #9352(swresample:new): swresample can introduce significant audio distortion

FFmpeg trac at avcodec.org
Fri Jul 30 19:56:34 EEST 2021


#9352: swresample can introduce significant audio distortion
-----------------------------------------------------+--------------------------------------
             Reporter:  Gregory Beauregard           |                    Owner:  (none)
                 Type:  defect                       |                   Status:  new
             Priority:  normal                       |                Component:  swresample
              Version:  git-master                   |               Resolution:
             Keywords:  loudnorm,ebur128,swresample  |               Blocked By:
             Blocking:                               |  Reproduced by developer:  0
Analyzed by developer:  0                            |
-----------------------------------------------------+--------------------------------------
Description changed by Gregory Beauregard:

Old description:

> UPDATE: see bottom, swresample is at fault
>
> Summary of the bug: The loudnorm filter can badly goof the loudness
> measurement in certain situations (e.g. particular inaudible noise after
> a resample) resulting in significant distortion if attempting to use it
> in dynamic mode or to get measurements.
>
> ffmpeg used is git master as of 2021-07-29.
>
> Using the reproducer file at
> [https://stream.gably.net/images/loudnorm_samp.mkv], `loudnorm_samp.mkv`
> (4.7 MB), 48 kHz DTS-HD, analyze the loudness with loudnorm to get
> integrated loudness ~-21: `"input_i" : "-21.57"` (output of this command
> attached):
> {{{
> ffmpeg -i loudnorm_samp.mkv -af
> aresample=ocl=stereo,loudnorm=print_format=json -f null -
> }}}
>
> However, when we reanalyze it by giving the resampler a dither and
> specifying 48 kHz, the loudnorm filter goofs the integrated loudness
> measurement with 0.13: `"input_i" : "0.13"`:
> {{{
> ffmpeg -i loudnorm_samp.mkv -af
> aresample=ocl=stereo:dither_method=shibata:osr=48000,loudnorm=print_format=json
> -f null -
> }}}
>
> This results in really wrong re-normalization resulting in significant
> distortion. If we run `ebur128` to analyze the loudness instead of
> `loudnorm` in the situation where it goofs, `ebur128` outputs an expected
> -22.2 integrated loudness, and doesn't change between the two situations
> above:
> {{{
> ffmpeg -i loudnorm_samp.mkv -af
> aresample=ocl=stereo:dither_method=shibata:osr=48000,ebur128 -f null -
> }}}
>
> So it seems like there's some odd sensitivity in the `loudnorm` filter to
> certain inaudible noise that's messing it up where `ebur128` is still ok.
>
> Update: You can use the following two `ebur128` filter runs similar to
> the above with and without inserting a `,aformat=r=192000`, resampling to
> 192 kHz as the `loudnorm` filter does unconditionally internally (but
> `ebur128` does not). I've attached the outputs as `192ebur.txt` and
> `48ebur.txt`.
>
> {{{
> ffmpeg -i loudnorm_samp.mkv -af
> aresample=ocl=stereo:dither_method=shibata:osr=48000,aformat=r=192000,ebur128
> -f null - > 192ebur.txt 2>&1
> }}}
> {{{
> ffmpeg -i loudnorm_samp.mkv -af
> aresample=ocl=stereo:dither_method=shibata:osr=48000,ebur128 -f null - >
> 48ebur.txt 2>&1
> }}}
>
> In the 192 kHz resample the `ebur128` filter wrongly measures the
> integrated loudness to `I:          -0.1 LUFS`, similar to the wrong
> `loudnorm` measurement, but the 48 kHz case measures -22.2 as you'd
> expect. My understanding is the two filters share some measurement code,
> so presumably ffmpeg's measurement code is broken at 192 kHz.
>
> UPDATE 2:
> The internal 192 kHz resample in `loudnorm` hid the real culprit here,
> swresample significantly distorting the audio itself.
>
> Listen to the sample with just the format conversions to see it has been
> significantly distorted:
> {{{
> ffplay -i loudnorm_samp.mkv -af
> aresample=ocl=stereo:dither_method=shibata:osr=48000,aformat=r=192000
> }}}
>
> We can also produce the significant distortion with just one resample
> with a `s32` sample format:
> {{{
> ffplay -i loudnorm_samp.mkv -af
> aresample=ocl=stereo:dither_method=shibata:osr=48000:osf=s32
> }}}
>
> Setting `resampler=soxr` does not fix the massive distortion.

New description:

 UPDATE: see bottom, swresample is at fault

 Summary of the bug: The loudnorm filter can badly mismeasure loudness in
 certain situations (e.g. particular inaudible noise after a resample),
 resulting in significant distortion when it is used in dynamic mode or to
 get measurements.

 ffmpeg used is git master as of 2021-07-29.

 Using the reproducer file at
 [https://stream.gably.net/images/loudnorm_samp.mkv], `loudnorm_samp.mkv`
 (4.7 MB), 48 kHz DTS-HD, analyze the loudness with loudnorm to get
 integrated loudness ~-21: `"input_i" : "-21.57"` (output of this command
 attached):
 {{{
 ffmpeg -i loudnorm_samp.mkv -af
 aresample=ocl=stereo,loudnorm=print_format=json -f null -
 }}}

 However, when we reanalyze it after giving the resampler a dither method
 and specifying a 48 kHz output rate, the loudnorm filter reports a badly
 wrong integrated loudness of 0.13 (`"input_i" : "0.13"`):
 {{{
 ffmpeg -i loudnorm_samp.mkv -af
 aresample=ocl=stereo:dither_method=shibata:osr=48000,loudnorm=print_format=json
 -f null -
 }}}
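
To compare the two runs without reading the JSON by eye, the `input_i`
value can be pulled out of the loudnorm report with standard tools. This is
only a minimal sketch: it assumes the report contains a line of the form
`"input_i" : "-21.57"` as quoted above, and the function name
`loudnorm_input_i` is made up for illustration.

```shell
# Hypothetical helper (not part of ffmpeg): print the value of "input_i"
# found in a loudnorm print_format=json report read from stdin.
# If several matches appear, the last one is kept.
loudnorm_input_i() {
    sed -n 's/.*"input_i"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | tail -n 1
}

# Example using only the line quoted in this ticket (not a full report):
printf '%s\n' '{' '    "input_i" : "-21.57",' '}' | loudnorm_input_i
```

With a real run, ffmpeg's log output goes to stderr, so something like
`ffmpeg -i loudnorm_samp.mkv -af aresample=ocl=stereo,loudnorm=print_format=json -f null - 2>&1 | loudnorm_input_i`
should print the value for that filter chain.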

 This wrong measurement leads to a wrong re-normalization and significant
 distortion. If we run `ebur128` instead of `loudnorm` on the filter chain
 where `loudnorm` fails, `ebur128` reports the expected integrated loudness
 of -22.2, and the value doesn't change between the two situations above:
 {{{
 ffmpeg -i loudnorm_samp.mkv -af
 aresample=ocl=stereo:dither_method=shibata:osr=48000,ebur128 -f null -
 }}}

 So it seems like there's some odd sensitivity in the `loudnorm` filter to
 certain inaudible noise that's messing it up where `ebur128` is still ok.

 Update: The problem can also be shown with `ebur128` alone. The following
 two runs are the same as above, with and without an inserted
 `,aformat=r=192000`, which resamples to 192 kHz as the `loudnorm` filter
 does unconditionally internally (but `ebur128` does not). I've attached
 the outputs as `192ebur.txt` and `48ebur.txt`.

 {{{
 ffmpeg -i loudnorm_samp.mkv -af
 aresample=ocl=stereo:dither_method=shibata:osr=48000,aformat=r=192000,ebur128
 -f null - > 192ebur.txt 2>&1
 }}}
 {{{
 ffmpeg -i loudnorm_samp.mkv -af
 aresample=ocl=stereo:dither_method=shibata:osr=48000,ebur128 -f null - >
 48ebur.txt 2>&1
 }}}

 With the 192 kHz resample the `ebur128` filter wrongly measures the
 integrated loudness as `I:          -0.1 LUFS`, similar to the wrong
 `loudnorm` measurement, but the 48 kHz case measures -22.2 as you'd
 expect. My understanding is the two filters share some measurement code,
 so presumably ffmpeg's measurement code is broken at 192 kHz.
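
The two captured logs can be compared mechanically. The following is a
minimal sketch, assuming each log contains a summary line of the form
`I:          -0.1 LUFS` as quoted above. The helper names
`ebur128_integrated` and `differs_by_more_than`, and the 1.0 LU tolerance,
are arbitrary choices for illustration.

```shell
# Hypothetical helper: print the last "I: <value> LUFS" value found on stdin.
# Per-frame log lines may also carry an "I:" field, so the last match is
# kept, on the assumption that the summary comes at the end of the log.
ebur128_integrated() {
    sed -n 's/.*I:[[:space:]]*\(-\{0,1\}[0-9][0-9.]*\)[[:space:]]*LUFS.*/\1/p' | tail -n 1
}

# Hypothetical helper: exit status 0 if |$1 - $2| is greater than $3.
differs_by_more_than() {
    awk -v a="$1" -v b="$2" -v tol="$3" \
        'BEGIN { d = a - b; if (d < 0) d = -d; exit !(d > tol) }'
}

# Example using the two values reported in this ticket (line layout assumed):
i192=$(printf '%s\n' '    I:          -0.1 LUFS' | ebur128_integrated)
i48=$(printf '%s\n' '    I:         -22.2 LUFS' | ebur128_integrated)
if differs_by_more_than "$i192" "$i48" 1.0; then
    echo "mismatch: $i192 vs $i48 LUFS"
fi
```

On the attached files this would be used as
`ebur128_integrated < 192ebur.txt` and `ebur128_integrated < 48ebur.txt`.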

 UPDATE 2:
 The internal 192 kHz resample in `loudnorm` hid the real culprit here:
 swresample itself significantly distorts the audio.

 Listen to the sample with just the format conversions to see it has been
 significantly distorted:
 {{{
 ffplay -i loudnorm_samp.mkv -af
 aresample=ocl=stereo:dither_method=shibata:osr=48000,aformat=r=192000
 }}}

 The audio is not distorted if we don't add the `aformat=r=192000` to the
 above; however, we can also reproduce the significant distortion with just
 one resample if we specify `s32` sample format:
 {{{
 ffplay -i loudnorm_samp.mkv -af
 aresample=ocl=stereo:dither_method=shibata:osr=48000:osf=s32
 }}}

 Setting `resampler=soxr` does not fix the distortion.

-- 
Ticket URL: <https://trac.ffmpeg.org/ticket/9352#comment:17>
FFmpeg <https://ffmpeg.org>
FFmpeg issue tracker