[FFmpeg-devel] [PATCH] AAC Encoder: clipping avoidance

Sat Jul 18 01:19:17 CEST 2015

On Fri, Jul 17, 2015 at 7:36 PM, Michael Niedermayer
<michael at niedermayer.cc> wrote:
>> diff --git a/libavcodec/aacenc.c b/libavcodec/aacenc.c
>> index f05f51b..6ff95b1 100644
>> --- a/libavcodec/aacenc.c
>> +++ b/libavcodec/aacenc.c
>> @@ -46,6 +46,7 @@
>>  #include "psymodel.h"
>>
>>  #define AAC_MAX_CHANNELS 6
>> +#define CLIP_AVOIDANCE_FACTOR 0.95f
>>
>>  #define ERROR_IF(cond, ...) \
>>      if (cond) { \
>> @@ -473,7 +474,29 @@ static void encode_spectral_coeffs(AACEncContext *s, SingleChannelElement *sce)
>>                                                     sce->ics.swb_sizes[i],
>>                                                     sce->sf_idx[w*16 + i],
>>                                                     sce->band_type[w*16 + i],
>> -                                                   s->lambda);
>> +                                                   s->lambda, sce->ics.window_clipping[w]);
>> +            start += sce->ics.swb_sizes[i];
>> +        }
>> +    }
>> +}
>> +
>> +/**
>> + * Downscale spectral coefficients for near-clipping windows to avoid artifacts
>> + */
>> +static void avoid_clipping(AACEncContext *s, SingleChannelElement *sce)
>> +{
>> +    int start, i, j, w, w2;
>> +
>> +    for (w = 0; w < sce->ics.num_windows; w += sce->ics.group_len[w]) {
>> +        start = 0;
>> +        for (i = 0; i < sce->ics.max_sfb; i++) {
>> +            if (sce->ics.window_clipping[w]) {
>> +                for (w2 = w; w2 < w + sce->ics.group_len[w]; w2++) {
>> +                    float *swb_coeffs = sce->coeffs + start + w2*128;
>> +                    for (j = 0; j < sce->ics.swb_sizes[i]; j++)
>> +                        swb_coeffs[j] *= CLIP_AVOIDANCE_FACTOR;
>> +                }
>> +            }
>
> wouldnt it be better to transition smoothly instead of a hard
> *0.95  vs. *1 ?

If you mean adjusting CLIP_AVOIDANCE_FACTOR to be the minimal factor
that prevents clipping, it probably would, but it would be rather
hard.

It would imply first measuring the quantization error, i.e. decoding
the encoded bitstream (many things add up that could cause clipping:
M/S, I/S, TNS, PNS, etc., so it cannot be done with any less work),
then computing the necessary attenuation to avoid clipping, re-coding,
and crossing fingers that the new frame won't clip (there's no
guarantee that quantization noise will be proportional). I wouldn't go
through all that trouble unless it guaranteed clip-free decoding.
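
To make the "computing the necessary attenuation" step concrete, here
is a rough sketch of that step alone. It assumes the decoded PCM of the
frame is already available (which is the expensive part) and that full
scale is [-1.0, 1.0]. The helper name min_clip_gain is hypothetical and
not part of the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the patch: given the decoded samples
 * of one frame (full scale assumed to be [-1.0, 1.0]), return the
 * largest gain <= 1 that brings the peak back to full scale. */
static float min_clip_gain(const float *decoded, size_t n)
{
    float peak = 0.0f;
    size_t i;

    assert(decoded || n == 0);

    /* Find the absolute peak of the decoded frame. */
    for (i = 0; i < n; i++) {
        float a = decoded[i] < 0.0f ? -decoded[i] : decoded[i];
        if (a > peak)
            peak = a;
    }

    /* No attenuation is needed if the frame stays within full scale. */
    return peak > 1.0f ? 1.0f / peak : 1.0f;
}
```

Even with that gain in hand, re-coding changes the quantization noise,
so the result would have to be decoded and checked again.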

If you mean a transition in time, I don't think it makes any
difference. 0.95 is a ~0.45 dB change in level (20*log10(0.95) is
about -0.446), which ought to be inaudible, and windowing will already
take care to make the transition smooth. And the logic to ramp
gradually wouldn't be completely free either, as it would have to ramp
fully to 0.95 by the time it reaches the first window marked as a
clipping hazard, and that could very well be the first window.
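
A minimal sketch of what such a ramp would have to do, to show the
corner case. Everything here is hypothetical and not part of the patch:
the function name ramp_gains, the linear ramp shape, and passing the
per-window clipping flags as a plain unsigned char array.

```c
#include <assert.h>

#define CLIP_AVOIDANCE_FACTOR 0.95f

/* Hypothetical sketch, not part of the patch: fill gain[] with a linear
 * ramp from 1.0 down to CLIP_AVOIDANCE_FACTOR that is complete at the
 * first window flagged in clipping[]. Returns the index of that window,
 * or -1 if no window is flagged. */
static int ramp_gains(const unsigned char *clipping, int num_windows,
                      float *gain)
{
    int first = -1, w;

    assert(num_windows > 0);

    /* Locate the first window marked as a clipping hazard. */
    for (w = 0; w < num_windows; w++) {
        if (clipping[w]) {
            first = w;
            break;
        }
    }

    for (w = 0; w < num_windows; w++) {
        if (first < 0)
            gain[w] = 1.0f;                  /* nothing to avoid */
        else if (w >= first)
            gain[w] = CLIP_AVOIDANCE_FACTOR; /* simplification: stay down */
        else                                 /* ramp over windows 0..first-1 */
            gain[w] = 1.0f - (1.0f - CLIP_AVOIDANCE_FACTOR) * w / first;
    }
    return first;
}
```

When the return value is 0 there are no windows left to ramp over
within the frame, so the gain steps straight to 0.95 anyway unless the
ramp starts in the previous frame.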

In all my tests I haven't noticed the boundary between attenuated and
non-attenuated windows, so I don't believe it's worth the hassle. But
if there's a sample that exhibits it, I'll be glad to attempt it.