[FFmpeg-devel] [PATCH] AAC: unroll parts of decode_spectrum_and_dequant()

Alex Converse alex.converse
Tue Dec 9 06:08:14 CET 2008


On Mon, Dec 8, 2008 at 10:58 PM, Jason Garrett-Glaser <darkshikari at gmail.com
> wrote:

> On Mon, Dec 8, 2008 at 7:34 PM, Alex Converse <alex.converse at gmail.com>
> wrote:
> > On Mon, Dec 8, 2008 at 9:33 PM, Jason Garrett-Glaser
> > <darkshikari at gmail.com>wrote:
> >
> >> On Mon, Dec 8, 2008 at 3:43 PM, Alex Converse <alex.converse at gmail.com>
> >> wrote:
> >> > Hi,
> >> >
> >> > The attached patch unrolling sections of decode spectrum saves me
> 5.48%
> >> on
> >> > my mpeg4-lc-256kbps stream on my core2 duo.
> >> >
> >> > Regards,
> >> > Alex Converse
> >>
> >> If dim can only be 2 or 4, wouldn't it be better to do
> >>
> >> if( dim == 4 ) {
> >> do dim 4 stuff
> >> }
> >> do dim 2 stuff
> >>
> >> The switch seems unnecessary.
> >>
> >
> > Idiomatically I like the switch better but your way is faster. When I did
> > that I also tried reverting access back to forward order and got a slight
> > speed up. This way made the unsigned loop just like the other three, so I
> > added that one for another benchmarked verified speed up.
> >
> > The net gain is a 12% decrease in cycles over the original vs 5% before.
>
> if (vq_ptr[2]) coef[coef_tmp_idx + 2] = 1 - 2*(int)get_bits1(gb);
> if (vq_ptr[3]) coef[coef_tmp_idx + 3] = 1 - 2*(int)get_bits1(gb);
>
> Isn't that a rather unnecessary int -> float conversion?  I'd think
> you could do much better than that considering there are only two
> possible input values...
>

In principle I agree. In practice is that a non germane change that needs to
be in a segregated patch? I had an improvement for ESC_BT that I tossed in a
separate patch.

--Alex




More information about the ffmpeg-devel mailing list