[FFmpeg-devel] [RFC] AAC Encoder, now more optimal

Michael Niedermayer michaelni
Sat Sep 6 18:17:38 CEST 2008


On Sat, Sep 06, 2008 at 06:36:16PM +0300, Kostya wrote:
> On Sat, Sep 06, 2008 at 03:13:54AM +0200, Michael Niedermayer wrote:
> > On Fri, Sep 05, 2008 at 04:13:58PM +0300, Kostya wrote:
> [...]
> > ok, so only optimization related comments
> > 
> > [...]
> > 
> > > /**
> > >  * Encode band info for single window group bands.
> > >  */
> > > static void encode_window_bands_info(AACEncContext *s, SingleChannelElement *sce,
> > >                                      int win, int group_len)
> > > {
> > >     BandCodingPath path[64];
> > >     int band_bits[64][12];
> > >     int w, swb, cb, start, start2, size;
> > >     int i, j;
> > >     const int max_sfb = sce->ics.max_sfb;
> > >     const int run_bits = sce->ics.num_windows == 1 ? 5 : 3;
> > >     const int run_esc = (1 << run_bits) - 1;
> > >     int bits, sbits, idx, count;
> > >     int stack[64], stack_len;
> > > 
> > >     start = win*128;
> > >     for(swb = 0; swb < max_sfb; swb++){
> > >         int maxval = 0;
> > >         start2 = start;
> > >         size = sce->ics.swb_sizes[swb];
> > >         if(sce->zeroes[win*16 + swb]){
> > >             maxval = 0;
> > >         }else{
> > >             for(w = 0; w < group_len; w++){
> > >                 for(i = start2; i < start2 + size; i++){
> > >                     maxval = FFMAX(maxval, FFABS(sce->icoefs[i]));
> > >                 }
> > >                 start2 += 128;
> > >             }
> > >         }
> > >         sbits = calculate_band_sign_bits(s, sce, group_len, start, size);
> > >         for(cb = 0; cb < 12; cb++){
> > >             if(aac_cb_info[cb].maxval < maxval){
> > >                 band_bits[swb][cb] = INT_MAX;
> > >             }else{
> > >                 band_bits[swb][cb] = calculate_band_bits(s, sce, group_len, start, size, cb);
> > >                 if(IS_CODEBOOK_UNSIGNED(cb-1)){
> > >                     band_bits[swb][cb] += sbits;
> > >                 }
> > >             }
> > >         }
> > >         start += sce->ics.swb_sizes[swb];
> > >     }
> > >     path[0].bits = 0;
> > >     for(i = 1; i <= max_sfb; i++)
> > >         path[i].bits = INT_MAX;
> > 
> > >     for(i = 0; i < max_sfb; i++){
> > >         for(cb = 0; cb < 12; cb++){
> > >             int sum = 0;
> > >             for(j = 1; j <= max_sfb - i; j++){
> > >                 if(band_bits[i+j-1][cb] == INT_MAX)
> > >                     break;
> > >                 sum += band_bits[i+j-1][cb];
> > >                 bits = sum + path[i].bits + run_value_bits[sce->ics.num_windows == 8][j];
> > >                 if(bits < path[i+j].bits){
> > >                     path[i+j].bits     = bits;
> > >                     path[i+j].codebook = cb;
> > >                     path[i+j].prev_idx = i;
> > >                 }
> > >             }
> > >         }
> > >     }
> > 
> > below should be faster, but _very_ important, check that it still finds the
> > global optimum, it should. That said, always check all optimizations you do
> > to ensure they do not change anything unexpected. Debugging such things later
> > can be very time consuming.
> > 
> > for(i = 0; i < max_sfb; i++){
> >     for(cb = 0; cb < 12; cb++){
> >         int r= last[cb].run;
> >         if(run_value_bits[r] == run_value_bits[r+1]){
> >             last[cb].bits += band_bits[i][cb];
> >             last[cb].run++;
> >             bits = band_bits[i][cb] + path[i-1].bits + run_value_bits[1];
> >             if(bits < last[cb].bits){
> >                 last[cb].bits= bits;
> >                 last[cb].run= run;
> >             }
> >         }else{
> >             int sum = 0;
> >             last[cb].bits = INT_MAX;
> >             for(run = 1; run<=i+1; run++){
> >                 sum += band_bits[i-run+1][cb];
> >                 bits = sum + path[i-run].bits + run_value_bits[run];
> >                 if(bits < last[cb].bits){
> >                     last[cb].bits= bits;
> >                     last[cb].run= run;
> >                 }
> >             }
> >         }
> >         if(last[cb].bits < path[i].bits){
> >             path[i].bits= last[cb].bits;
> >             path[i].codebook= cb;
> >             path[i].run= last[cb].run;
> >         }
> >     }
> > }
> > 
> > also this should be working with rate distortion values not just bits.
> 
> IMO it should encode losslessly at this stage, so no rate distortion.

How much quality do we lose from this?
And why exactly should the RD values not be used?!


> 
> While this seems a bit faster it's significantly worse - on my test sample it's
> 222809 bytes against 209792 with current code. 

Well, it was just showing how to optimize the code; it's untested, and it's not
surprising that it contains (a) bug(s).
After all, you were asking for optimization tips, not for others to do the
optimizations.
Does the code work if the if() is disabled by && 0? With the if() disabled it
should behave identically to the old code.
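
One way to run that check outside the encoder is sketched below. It is a
minimal, untested-against-the-encoder sketch: forward_dp() mirrors the shape of
the old loop, backward_dp() mirrors the else branch of the proposed loop (that
is, the loop with the if() disabled), and both return only the minimum bit
count. NB_CB, MAX_BANDS, run_cost() and both function names are hypothetical;
run_cost() merely stands in for run_value_bits[] and assumes a 4-bit codebook
index plus 3-bit run fields with escape value 7, following run_bits and run_esc
for short windows. Indices are shifted by one (best[k] covers bands 0..k-1) so
that nothing reads path[-1].

```c
#include <assert.h>
#include <limits.h>

#define NB_CB     3   /* hypothetical; the encoder loops over 12 codebooks */
#define MAX_BANDS 64

/* Hypothetical stand-in for run_value_bits[]: 4 bits for the codebook index
 * plus one 3-bit run field per escape (run_esc = 7) and one for the rest. */
int run_cost(int run)
{
    return 4 + 3 * (run / 7 + 1);
}

/* Shape of the old loop: from each start band i, extend a run forward. */
int forward_dp(const int bits[][NB_CB], int n)
{
    int best[MAX_BANDS + 1];            /* best[k]: cheapest bands 0..k-1 */
    assert(n >= 0 && n <= MAX_BANDS);
    best[0] = 0;
    for (int i = 1; i <= n; i++)
        best[i] = INT_MAX;
    for (int i = 0; i < n; i++) {
        if (best[i] == INT_MAX)         /* band i is not reachable */
            continue;
        for (int cb = 0; cb < NB_CB; cb++) {
            int sum = 0;
            for (int j = 1; j <= n - i; j++) {
                if (bits[i + j - 1][cb] == INT_MAX)
                    break;              /* codebook cannot code this band */
                sum += bits[i + j - 1][cb];
                int total = sum + best[i] + run_cost(j);
                if (total < best[i + j])
                    best[i + j] = total;
            }
        }
    }
    return best[n];
}

/* Shape of the else branch: for each end band i, try every run ending there. */
int backward_dp(const int bits[][NB_CB], int n)
{
    int best[MAX_BANDS + 1];
    assert(n >= 0 && n <= MAX_BANDS);
    best[0] = 0;
    for (int i = 0; i < n; i++) {
        best[i + 1] = INT_MAX;
        for (int cb = 0; cb < NB_CB; cb++) {
            int sum = 0;
            for (int run = 1; run <= i + 1; run++) {
                int b = bits[i - run + 1][cb];
                if (b == INT_MAX)
                    break;              /* longer runs would include it too */
                sum += b;
                if (best[i + 1 - run] == INT_MAX)
                    continue;           /* no valid coding before this run */
                int total = sum + best[i + 1 - run] + run_cost(run);
                if (total < best[i + 1])
                    best[i + 1] = total;
            }
        }
    }
    return best[n];
}
```

Feeding both functions the same band_bits-style table and comparing the results
is the whole check; any faster variant (such as the if() branch) can then be
compared against either one on many tables before it goes into the encoder.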

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

It is dangerous to be right in matters on which the established authorities
are wrong. -- Voltaire