[FFmpeg-devel] [RFC] AAC Encoder

Michael Niedermayer michaelni
Mon Aug 18 13:03:18 CEST 2008


On Mon, Aug 18, 2008 at 09:57:50AM +0300, Kostya wrote:
> On Sun, Aug 17, 2008 at 02:59:01PM +0200, Michael Niedermayer wrote:
> > On Sun, Aug 17, 2008 at 02:40:19PM +0300, Kostya wrote:
> [...]
> > > > anyway iam not arguing for the encoder to do exactly what the decoder does ATM
> > > > but it must be clean, compact, optimal and decoder and encoder should match
> > > > each other whenever possible.
> > > > 
> > > > Could you maybe point to exactly what code would become uglier with the
> > > > style used by the decoder? ideally with a diff showing the uglification?
> > > 
> > > they are almost the same to the extent decoder knows all limits and can
> > > exploit it and my encoder have data for all windows and just marks window
> > > group starts.
> > 
> > can the decoder be changed to use the system of the encoder?
> > I really would prefer if both where using the same system ...
> > also [w*16 + i] can be written as [w][i] which is a little cleaner
> 
> we have either up to 51 scalefactor bands in one window or
> up to 15 scalefactor bands in 8 windows, so flat array is 4 times smaller
> than of two-dimensional one

affraid of sf[0][50] in a uint8_t sf[8][16] array?
Iam frequently doing such things and so far it has neither been a problem in
practice nor has any language lawyer sued me.


>  
> [...]
> > > > [...]
> > > > > +/**
> > > > > + * Encode scalefactors.
> > > > > + */
> > > > > +static void encode_scale_factors(AVCodecContext *avctx, AACEncContext *s, ChannelElement *cpe, int channel, int global_gain)
> > > > > +{
> > > > > +    int off = global_gain, diff;
> > > > > +    int i, w, wg;
> > > > > +
> > > > > +    w = 0;
> > > > > +    for(wg = 0; wg < cpe->ch[channel].ics.num_window_groups; wg++){
> > > > > +        for(i = 0; i < cpe->ch[channel].ics.max_sfb; i++){
> > > > > +            if(!cpe->ch[channel].zeroes[w*16 + i]){
> > > > > +                /* if we have encountered scale=256 it means empty band
> > > > > +                 * which was decided to be coded by encoder, so assign it
> > > > > +                 * last scalefactor value for compression efficiency
> > > > > +                 */
> > > > > +                if(cpe->ch[channel].sf_idx[w*16 + i] == 256)
> > > > > +                    cpe->ch[channel].sf_idx[w*16 + i] = off;
> > > > 
> > > > why is th code that selects scalefactors not simply setting it to the last
> > > > scale factor?
> > > 
> > > it may occur before first band with scale too 
> > 
> > for(i=end; i>=0; i--)
> >     if(sf[i] == 256) sf[i]=last;
> >     else             last= sf[i];
> > 
> > as a sideeffect s[0] will contain the "global_gain" which ive seen some
> > complicated code to find because the firsts may be 256
>  
> what about the last bands then? They are usually empty and encoder
> probably can choose to encode one in the run too.

there could be a second loop, anyway, you can do it anyway you like as
long as its the simplest&cleanest solution (even the with 256 but my
feeling says this isnt the simplest way)


>  
> [...]
> > > +    int wg;
> > >  
> > >      put_bits(&s->pb, 1, 0);                // ics_reserved bit
> > >      put_bits(&s->pb, 2, info->window_sequence[0]);
> > > @@ -248,15 +334,307 @@
> > >          put_bits(&s->pb, 1, 0);            // no prediction
> > >      }else{
> > >          put_bits(&s->pb, 4, info->max_sfb);
> > > -        for(i = 1; i < info->num_windows; i++)
> > > -            put_bits(&s->pb, 1, info->group_len[i]);
> > > +        for(wg = 0; wg < info->num_window_groups; wg++){
> > > +            if(wg)
> > > +                put_bits(&s->pb, 1, 0);
> > > +            if(info->group_len[wg] > 1)
> > > +                put_sbits(&s->pb, info->group_len[wg] - 1, 0xFF);
> > > +        }
> > > +    }
> > > +}
> > 
> > is the change in group_len content leading to an overall simplifiation?
> > Because it makes the code distinctly worse here
> > Also the way i understood you was that you didnt want arrays compacted
> > like that
>  
> there are two approaches - store bit mask that indicates window belonging
> to the group or starting new group, and storing group lengths explicitly.
> 
> First approach gives a lot of if(group_len[wg]) continue and simplier put_ics_info(),
> second one makes put_ics_info() uglier but better handle group lengths.

the code was
for(i = 1; i < info->num_windows; i++)
    put_bits(&s->pb, 1, info->group_len[i]);

why can it not be
for(i = 1; i < info->num_windows; i++)
    put_bits(&s->pb, 1, !!info->group_len[i]);

?

group_len here can be the actual lens

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Many that live deserve death. And some that die deserve life. Can you give
it to them? Then do not be too eager to deal out death in judgement. For
even the very wise cannot see all ends. -- Gandalf
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20080818/93608a69/attachment.pgp>



More information about the ffmpeg-devel mailing list