[FFmpeg-devel] [Patch] AAC encoder improvements

Sun May 5 02:11:27 CEST 2013

>
>> @@ -814,12 +819,12 @@ static void search_for_quantizers_twoloop(AVCodecContext *avctx,
>>                  }
>>              }
>>              if (tbits > destbits) {
>> -                for (i = 0; i < 128; i++)
>> -                    if (sce->sf_idx[i] < 218 - qstep)
>> +                for (i = 0; i < 128; i++)
>> +                    if (sce->sf_idx[i] < 218 - qstep)
>
> that looks unintended

Indeed, just whitespace

>>                          sce->sf_idx[i] += qstep;
>> -            } else {
>> -                for (i = 0; i < 128; i++)
>> -                    if (sce->sf_idx[i] > 60 - qstep)
>> +            } else if (tbits < destbits) {
>> +                for (i = 0; i < 128; i++)
>> +                    if (sce->sf_idx[i] > 60 + qstep)
>>                          sce->sf_idx[i] -= qstep;
>>              }
>>              qstep >>= 1;

This one's what matters.
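
To make the effect of the changed bound concrete, here is a minimal standalone sketch of one adjustment pass, not the actual aaccoder.c code. The helper name adjust_sf and the constants SF_MIN, SF_MAX and NUM_BANDS are made up for illustration; the values 60, 218 and 128 are taken from the hunk above. The real loop also halves qstep after each pass, which is left out here.

```c
#include <assert.h>

#define NUM_BANDS 128  /* loop bound used in the hunk (illustrative name) */
#define SF_MIN     60  /* lower limit used in the hunk (illustrative name) */
#define SF_MAX    218  /* upper limit used in the hunk (illustrative name) */

/* Hypothetical helper: one adjustment pass following the patched logic. */
static void adjust_sf(int *sf_idx, int tbits, int destbits, int qstep)
{
    int i;

    assert(qstep >= 0);

    if (tbits > destbits) {
        /* Over budget: raise scalefactors (coarser quantization),
         * only where the result stays below SF_MAX. */
        for (i = 0; i < NUM_BANDS; i++)
            if (sf_idx[i] < SF_MAX - qstep)
                sf_idx[i] += qstep;
    } else if (tbits < destbits) {
        /* Under budget: lower scalefactors (finer quantization),
         * only where the result stays above SF_MIN. */
        for (i = 0; i < NUM_BANDS; i++)
            if (sf_idx[i] > SF_MIN + qstep)
                sf_idx[i] -= qstep;
    }
    /* tbits == destbits: already on target, leave everything alone. */
}
```

With the old test (sf_idx[i] > 60 - qstep), a band at 61 with qstep = 32 would pass the check and drop to 29, below the limit of 60. With the new test it is left at 61, and the exact-match case no longer lowers anything.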

>>  aaccoder.c |    4 ++--
>>  aacenc.c   |    9 +++++++++
>>  2 files changed, 11 insertions(+), 2 deletions(-)
>> 4418865c669267d4c773f7e7d0d20cd2cfe116b8  0005-aac-jointstereo.patch
>> From cf7ebf247ffca9cd2517f52b99b6922cbf3c1e3b Mon Sep 17 00:00:00 2001
>> From: Claudio Freire <klaussfreire at gmail.com>
>> Date: Sat, 4 May 2013 18:39:15 -0300
>> Subject: [PATCH 5/5] Several improvements to the AAC encoder:
>>    * After MS mode search, psy and quantization must be re-done, or the
>>      resulting quantization noise will be ridiculously wrong.
>>    * MS cost estimation should use avg thresholds for mid channel (avg
>>      signal would result in avg thresholds per psy model), changed side
>>      channel to use min threshold. Side thresholds aren't estimable in
>>      any way other than recalculation (TODO), so min is the most
>>      conservative estimate short of a re-application of psy.
>>      Seems to work fine enough like this.
>>
>> ---
>>  libavcodec/aaccoder.c |    4 ++--
>>  libavcodec/aacenc.c   |    9 +++++++++
>>  2 files changed, 11 insertions(+), 2 deletions(-)
>>
>> diff --git a/libavcodec/aaccoder.c b/libavcodec/aaccoder.c
>> index 86be276..e0285bb 100644
>> --- a/libavcodec/aaccoder.c
>> +++ b/libavcodec/aaccoder.c
>> @@ -1072,7 +1072,7 @@ static void search_for_ms(AACEncContext *s, ChannelElement *cpe,
>>                      FFPsyBand *band0 = &s->psy.ch[s->cur_channel+0].psy_bands[(w+w2)*16+g];
>>                      FFPsyBand *band1 = &s->psy.ch[s->cur_channel+1].psy_bands[(w+w2)*16+g];
>>                      float minthr = FFMIN(band0->threshold, band1->threshold);
>> -                    float maxthr = FFMAX(band0->threshold, band1->threshold);
>> +                    float avgthr = 0.5f*(band0->threshold + band1->threshold);
>>                      for (i = 0; i < sce0->ics.swb_sizes[g]; i++) {
>>                          M[i] = (sce0->coeffs[start+w2*128+i]
>>                                + sce1->coeffs[start+w2*128+i]) * 0.5;
>> @@ -1100,7 +1100,7 @@ static void search_for_ms(AACEncContext *s, ChannelElement *cpe,
>>                                                  sce0->ics.swb_sizes[g],
>>                                                  sce0->sf_idx[(w+w2)*16+g],
>>                                                  sce0->band_type[(w+w2)*16+g],
>> -                                                lambda / maxthr, INFINITY, NULL);
>> +                                                lambda / avgthr, INFINITY, NULL);
>>                      dist2 += quantize_band_cost(s, S,
>>                                                  S34,
>>                                                  sce1->ics.swb_sizes[g],
>> diff --git a/libavcodec/aacenc.c b/libavcodec/aacenc.c
>> index 80dd3d8..aa93c90 100644
>> --- a/libavcodec/aacenc.c
>> +++ b/libavcodec/aacenc.c
>> @@ -621,6 +621,15 @@ static int aac_encode_frame(AVCodecContext *avctx, AVPacket *avpkt,
>>                  }
>>              }
>>              adjust_frame_information(cpe, chans);
>> +            if (cpe->ms_mode) {
>> +                /* Re-evaluate psy model and quantization selection based on
>> +                   MS-transformed channels */
>> +                s->psy.model->analyze(&s->psy, start_ch, coeffs, wi);
>> +                for (ch = 0; ch < chans; ch++) {
>> +                    s->cur_channel = start_ch * 2 + ch;
>> +                    s->coder->search_for_quantizers(avctx, s, &cpe->ch[ch], s->lambda);
>> +                }
>> +            }
>
> shouldn't this and the previous hunks be in separate patches, or is
> there some dependency?

Well, they're related, as they both pertain to joint stereo, but
there's no hard dependency between them, that's true. I began with
the first hunk, suspecting that those artifacts were due to bad
mid/side choices, but it's not enough on its own. The really
important hunk is the second, re-doing quantization.
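
For reference, the threshold choice described in the commit message can be sketched in isolation like this. This is only an illustration, not the code from the patch: the MSThresholds struct and the ms_thresholds helper are hypothetical names, and the inputs stand in for band0->threshold and band1->threshold.

```c
#include <assert.h>

/* Hypothetical container for the estimated mid/side thresholds. */
typedef struct MSThresholds {
    float mid;   /* estimate for the mid channel  */
    float side;  /* estimate for the side channel */
} MSThresholds;

/* Hypothetical helper: estimate M/S thresholds from the L/R thresholds
 * of one band, as described in the commit message:
 *   mid  -> average of the two thresholds (replaces the former max)
 *   side -> minimum of the two thresholds (conservative estimate) */
static MSThresholds ms_thresholds(float thr_l, float thr_r)
{
    MSThresholds t;

    assert(thr_l >= 0.0f && thr_r >= 0.0f);

    t.mid  = 0.5f * (thr_l + thr_r);
    t.side = thr_l < thr_r ? thr_l : thr_r;
    return t;
}
```

In the patch the mid estimate is what ends up in lambda / avgthr for the cost of the mid band, so a lower threshold than the former max makes distortion in that band count for more.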