[Ffmpeg-cvslog] r5898 - in trunk/libavcodec: dsputil.c dsputil.h i386/dsputil_mmx.c vorbis.c vorbis.h
Tue Aug 8 06:09:45 CEST 2006
On Thu, 3 Aug 2006, Benjamin Larsson wrote:
> Loren Merritt wrote:
>> On Thu, 3 Aug 2006, Benjamin Larsson wrote:
>>> If you want to optimize more you could look at the mdct pre and post
>>> twiddle steps in mdct.c. Currently they are scalar operations. Optimizing
>>> this would also give a gain to wma and aac.
> I forgot that ac3 also would gain from this.
If there were an ffac3 and ffaac, that is.
>> hmm, those are annoying because the data aren't contiguous.
> I don't understand. Can you elaborate?
In some of the arrays, the data for iteration k is next to the data for
iteration k+1, and in other arrays it's next to the data for iteration n-k.
This is fine for 3dnow, which just loads one complex number into one mmreg.
But an sse register holds two complex numbers, so I would have to unroll the
loop an extra time (doing iterations k, k+1, n-k, n-k-1 all at once) in
order to load the data efficiently.
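
To make the access pattern concrete, here is a rough sketch in plain C. It
is not the code from mdct.c: the cplx type, the function names and the
exact indexing are made up for illustration, and the mirrored index is
written n-1-k because the arrays are 0-based. in[] is read forward for the
real part and backward for the imaginary part, so each group of four
adjacent floats (one sse-sized load) belongs to iterations k, k+1, n-1-k
and n-2-k.

```c
#include <stddef.h>

/* Hypothetical complex type for this sketch; not taken from mdct.c. */
typedef struct { float re, im; } cplx;

/* Multiply the complex number (re, im) by the twiddle factor w. */
static inline void rot(cplx *dst, float re, float im, cplx w)
{
    dst->re = re * w.re - im * w.im;
    dst->im = re * w.im + im * w.re;
}

/* Scalar form: iteration k reads one float walking forward and one
 * walking backward through in[], which holds 2*n floats. */
static void twiddle_scalar(cplx *out, const float *in, const cplx *tw,
                           size_t n)
{
    for (size_t k = 0; k < n; k++)
        rot(&out[k], in[2 * k], in[2 * n - 1 - 2 * k], tw[k]);
}

/* Unrolled form: two groups of four adjacent floats feed four
 * iterations at once. Assumes n is a multiple of 4. */
static void twiddle_unrolled(cplx *out, const float *in, const cplx *tw,
                             size_t n)
{
    for (size_t k = 0; k < n / 2; k += 2) {
        /* lo[0]=re(k)     lo[1]=im(n-1-k) lo[2]=re(k+1)   lo[3]=im(n-2-k) */
        const float *lo = in + 2 * k;
        /* hi[0]=re(n-2-k) hi[1]=im(k+1)   hi[2]=re(n-1-k) hi[3]=im(k)     */
        const float *hi = in + 2 * (n - 2 - k);

        rot(&out[k],         lo[0], hi[3], tw[k]);
        rot(&out[k + 1],     lo[2], hi[1], tw[k + 1]);
        rot(&out[n - 1 - k], hi[2], lo[1], tw[n - 1 - k]);
        rot(&out[n - 2 - k], hi[0], lo[3], tw[n - 2 - k]);
    }
}

/* Returns 1 if both forms produce identical output on a small test.
 * Inputs are small integers, so every float result is exact. */
static int twiddle_versions_agree(void)
{
    enum { N = 8 };
    float in[2 * N];
    cplx tw[N], a[N], b[N];

    for (size_t i = 0; i < 2 * N; i++)
        in[i] = (float)i;
    for (size_t k = 0; k < N; k++) {
        tw[k].re = (float)(k + 1);
        tw[k].im = (float)(k + 2);
    }

    twiddle_scalar(a, in, tw, N);
    twiddle_unrolled(b, in, tw, N);

    for (size_t k = 0; k < N; k++)
        if (a[k].re != b[k].re || a[k].im != b[k].im)
            return 0;
    return 1;
}
```

In a real sse version, lo and hi would each be a single 4-float load
followed by shuffles, and the four rot() calls would become packed
multiplies; the plain C above only shows which iterations have to be
grouped so that every loaded float gets used.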