[Ffmpeg-cvslog] r5898 - in trunk/libavcodec: dsputil.c dsputil.h i386/dsputil_mmx.c vorbis.c vorbis.h

Tue Aug 8 06:09:45 CEST 2006

On Thu, 3 Aug 2006, Benjamin Larsson wrote:

> Loren Merritt wrote:
>> On Thu, 3 Aug 2006, Benjamin Larsson wrote:
>> 
>>> If you want to optimize more you could look at the mdct pre and post 
>>> twiddle steps in mdct.c. Currently they are scalar operations. Optimizing 
>>> this would also give a gain to wma and aac.
>> 
> I forgot that ac3 also would gain from this.

If there were an ffac3 and ffaac, that is.

>> hmm, those are annoying because the data aren't contiguous.
>
> I don't understand. Can you elaborate?

In some of the arrays, the data for iteration k is next to the data for 
iteration k+1, and in other arrays it's next to the data for iteration n-k. 
This is fine for 3dnow, which just loads one complex number into one mmreg. 
But for sse, I would have to unroll the loop an extra time (doing iterations 
k, k+1, n-k, n-k-1 all at once) in order to load the data efficiently.
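
To make the access pattern concrete, here is a rough sketch in plain C. It is
not the code in mdct.c: the function names, the Complex type and the exact
index layout are assumptions chosen for illustration (no bit-reversal table,
0-based mirroring so that iteration k pairs with n4-1-k). The first function
is a scalar pre-twiddle that reads the input from both ends; the second groups
four iterations so that every value comes from one of two contiguous 4-float
blocks, which is the shape a 4-wide sse load would want.

```c
#include <assert.h>

typedef struct { float re, im; } Complex;  /* hypothetical complex type */

/* z = (a + i*b) * (c + i*s) */
static void twiddle(Complex *z, float a, float b, float c, float s)
{
    z->re = a * c - b * s;
    z->im = a * s + b * c;
}

/* Scalar reference: iteration k reads in[2k] (walking forward) and
 * in[n2-1-2k] (walking backward), plus tcos[k] and tsin[k]. */
static void pretwiddle_scalar(Complex *z, const float *in,
                              const float *tcos, const float *tsin, int n4)
{
    int n2 = 2 * n4;
    for (int k = 0; k < n4; k++)
        twiddle(&z[k], in[n2 - 1 - 2 * k], in[2 * k], tcos[k], tsin[k]);
}

/* Unrolled variant: iterations k, k+1 and their mirrors m, m-1 are done
 * together. All eight input values then come from two contiguous blocks:
 *   lo = in[2k .. 2k+3]   and   hi = in[n2-4-2k .. n2-1-2k].
 * Requires n4 to be a multiple of 4. */
static void pretwiddle_unrolled(Complex *z, const float *in,
                                const float *tcos, const float *tsin, int n4)
{
    int n2 = 2 * n4;
    assert(n4 % 4 == 0);
    for (int k = 0; k < n4 / 2; k += 2) {
        int m = n4 - 1 - k;                     /* mirrored iteration */
        const float *lo = in + 2 * k;           /* front block, 4 floats */
        const float *hi = in + n2 - 4 - 2 * k;  /* back block, 4 floats */

        twiddle(&z[k],     hi[3], lo[0], tcos[k],     tsin[k]);
        twiddle(&z[k + 1], hi[1], lo[2], tcos[k + 1], tsin[k + 1]);
        twiddle(&z[m],     lo[1], hi[2], tcos[m],     tsin[m]);
        twiddle(&z[m - 1], lo[3], hi[0], tcos[m - 1], tsin[m - 1]);
    }
}
```

With only iterations k and k+1 in flight, half of each 4-float block would go
unused (lo[1], lo[3], hi[0], hi[2] belong to the mirrored iterations), hence
the extra unrolling. A real sse version would still need shuffles to put the
lanes in order; this sketch only shows which loads line up.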

--Loren Merritt