[Ffmpeg-cvslog] r5898 - in trunk/libavcodec: dsputil.c dsputil.h i386/dsputil_mmx.c vorbis.c vorbis.h
Tue Aug 8 08:48:58 CEST 2006
Loren Merritt wrote:
> On Thu, 3 Aug 2006, Benjamin Larsson wrote:
>> Loren Merritt wrote:
>>> On Thu, 3 Aug 2006, Benjamin Larsson wrote:
>>>> If you want to optimize more you could look at the mdct pre and
>>>> post twiddle steps in mdct.c. Currently they are scalar operations.
>>>> Optimizing this would also give a gain to wma and aac.
>> I forgot that ac3 also would gain from this.
> If there were an ffac3 and ffaac, that is.
Let's hope there will be.
>>> hmm, those are annoying because the data aren't contiguous.
>> I don't understand. Can you elaborate?
> In some of the arrays, the data for iteration k is next to the data
> for iteration k+1, and in other arrays it's next to n-k. This is fine
> for 3dnow, which just loads one complex number into one mmreg. But for
> sse, I would have to unroll the loop an extra time (doing iterations
> k, k+1, n-k, n-k-1 all at once) in order to load the data efficiently.
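For illustration, here is a scalar C sketch of the grouping described above. It is not the actual mdct.c code: the Cpx type, the cmul helper, the function names and the exact n-1-k offset are assumptions made only to show the access pattern. One array is walked forward and the other backward, and handling iterations k, k+1, n-1-k and n-2-k together means every read is a pair of adjacent complex values, which is what a 128-bit SSE load wants.

```c
#include <assert.h>

typedef struct { float re, im; } Cpx;   /* hypothetical complex type */

/* Hypothetical per-iteration work: one complex multiply. */
static Cpx cmul(Cpx x, Cpx y)
{
    Cpx r = { x.re * y.re - x.im * y.im, x.re * y.im + x.im * y.re };
    return r;
}

/* Straightforward loop: a is walked forward, b is walked backward. */
static void twiddle_scalar(Cpx *out, const Cpx *a, const Cpx *b, int n)
{
    for (int k = 0; k < n; k++)
        out[k] = cmul(a[k], b[n - 1 - k]);
}

/* Same result, grouped as k, k+1, n-1-k, n-2-k. The reads are
 * a[k..k+1], b[m..m+1], a[m..m+1] and b[k..k+1]: each is two adjacent
 * complex values, i.e. four adjacent floats. */
static void twiddle_grouped(Cpx *out, const Cpx *a, const Cpx *b, int n)
{
    assert(n % 4 == 0);                 /* assumption: no tail handling */
    for (int k = 0; k < n / 2; k += 2) {
        int m = n - 2 - k;              /* low index of the mirrored pair */
        out[k]     = cmul(a[k],     b[m + 1]);
        out[k + 1] = cmul(a[k + 1], b[m]);
        out[m]     = cmul(a[m],     b[k + 1]);
        out[m + 1] = cmul(a[m + 1], b[k]);
    }
}
```

The b pairs arrive in reverse order relative to a, so a real SSE version would probably still need a shuffle after each load. The sketch only shows why the extra unrolling makes the loads contiguous.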
OK, I understand. How about the window overlap? Would there be any
significant gain from writing SIMD code for that operation? The window
could be mirrored to get rid of the n-k indexing.
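
A rough C sketch of the mirroring idea, under assumptions: this is a generic windowed overlap-add, not the code in vorbis.c, and the function names and the n-1-k offset are hypothetical. A reversed copy of the window is built once, after which every array in the loop is read at index k.

```c
#include <assert.h>
#include <stddef.h>

/* Build the mirrored window once: win_rev[k] == win[n-1-k]. */
static void mirror_window(float *win_rev, const float *win, size_t n)
{
    for (size_t k = 0; k < n; k++)
        win_rev[k] = win[n - 1 - k];
}

/* Reference form: the window is read both forward (k) and backward (n-1-k). */
static void overlap_ref(float *out, const float *prev, const float *cur,
                        const float *win, size_t n)
{
    for (size_t k = 0; k < n; k++)
        out[k] = prev[k] * win[n - 1 - k] + cur[k] * win[k];
}

/* Mirrored form: all four arrays are indexed by k, so four consecutive
 * iterations read four consecutive floats from each array. */
static void overlap_mirrored(float *out, const float *prev, const float *cur,
                             const float *win, const float *win_rev, size_t n)
{
    assert(out != prev && out != cur);  /* assumption: not done in place */
    for (size_t k = 0; k < n; k++)
        out[k] = prev[k] * win_rev[k] + cur[k] * win[k];
}
```

The cost is one extra window-sized table per block size. For a symmetric window the mirrored copy is identical to the original and no extra table is needed.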
> --Loren Merritt