[FFmpeg-devel] [PATCH 2/2] SSE optimized mp3 windowing

Vitor Sessak vitor1001
Thu Jun 17 22:56:36 CEST 2010


On 06/17/2010 09:56 PM, Loren Merritt wrote:
> On Thu, 17 Jun 2010, Vitor Sessak wrote:
>
>> + "movaps (%0,%5), %%xmm1 \n\t"
>> + "movaps (%2,%5), %%xmm2 \n\t"
>> + "movaps (%1,%5), %%xmm3 \n\t"
>
> One of these can be a memory arg to mulps.

Already addressed in my latest patch in the same thread.
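
For readers of the archive, a minimal sketch of what a memory argument to mulps looks like. The function name mul_sub_pair and its operand layout are made up for illustration and are not taken from the patch; the plain C fallback is only there so the sketch builds on non-x86 targets. Since the window register is not needed after the second multiply, that multiply can read its other factor straight from memory and drop one movaps:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: acc0[i] -= a[i]*w[i]; acc1[i] -= b[i]*w[i], i = 0..3.
 * All pointers must be 16-byte aligned (movaps and a mulps memory
 * operand both fault on unaligned addresses). */
static void mul_sub_pair(float *acc0, float *acc1,
                         const float *a, const float *b, const float *w)
{
    assert((((uintptr_t)acc0 | (uintptr_t)acc1 | (uintptr_t)a |
             (uintptr_t)b | (uintptr_t)w) & 15) == 0);
#if defined(__GNUC__) && defined(__SSE__) && \
    (defined(__i386__) || defined(__x86_64__))
    __asm__ volatile(
        "movaps (%2), %%xmm1            \n\t" /* xmm1 = a            */
        "movaps (%4), %%xmm2            \n\t" /* xmm2 = w            */
        "mulps  %%xmm2, %%xmm1          \n\t" /* xmm1 = a * w        */
        "mulps  (%3), %%xmm2            \n\t" /* xmm2 = b * w, load folded in */
        "movaps (%0), %%xmm0            \n\t"
        "movaps (%1), %%xmm4            \n\t"
        "subps  %%xmm1, %%xmm0          \n\t" /* acc0 -= a * w       */
        "subps  %%xmm2, %%xmm4          \n\t" /* acc1 -= b * w       */
        "movaps %%xmm0, (%0)            \n\t"
        "movaps %%xmm4, (%1)            \n\t"
        :
        : "r"(acc0), "r"(acc1), "r"(a), "r"(b), "r"(w)
        : "memory", "xmm0", "xmm1", "xmm2", "xmm4");
#else
    /* Portable fallback with the same result. */
    for (int i = 0; i < 4; i++) {
        acc0[i] -= a[i] * w[i];
        acc1[i] -= b[i] * w[i];
    }
#endif
}
```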

>> + "mulps %%xmm2, %%xmm1 \n\t"
>> + "subps %%xmm1, %%xmm0 \n\t"
>> + "mulps %%xmm2, %%xmm3 \n\t"
>> + "subps %%xmm3, %%xmm4 \n\t"
>> [repeated lots of times]
>
> Looks like a place for a macro.

Good point. I also used macros for the other block.
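
A rough sketch of the macro idea, not the macro from the attached patch: MUL_SUB and window_acc are hypothetical names, and the four 16-byte steps are an arbitrary choice for the example. The macro stringifies a byte offset into the asm text, so each repeated multiply-subtract step becomes one line:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical macro: one multiply-subtract step at byte offset `off`.
 * Expands to three asm lines; xmm0 is the running accumulator. */
#define MUL_SUB(off)                               \
    "movaps " #off "(%1), %%xmm1        \n\t"      \
    "mulps  " #off "(%2), %%xmm1        \n\t"      \
    "subps  %%xmm1, %%xmm0              \n\t"

/* acc[i] -= sum over k = 0..3 of in[4*k + i] * win[4*k + i], i = 0..3.
 * All pointers must be 16-byte aligned. */
static void window_acc(float *acc, const float *in, const float *win)
{
    assert((((uintptr_t)acc | (uintptr_t)in | (uintptr_t)win) & 15) == 0);
#if defined(__GNUC__) && defined(__SSE__) && \
    (defined(__i386__) || defined(__x86_64__))
    __asm__ volatile(
        "movaps (%0), %%xmm0            \n\t"
        MUL_SUB(0)
        MUL_SUB(16)
        MUL_SUB(32)
        MUL_SUB(48)
        "movaps %%xmm0, (%0)            \n\t"
        :
        : "r"(acc), "r"(in), "r"(win)
        : "memory", "xmm0", "xmm1");
#else
    /* Portable fallback with the same order of operations. */
    for (int k = 0; k < 4; k++)
        for (int i = 0; i < 4; i++)
            acc[i] -= in[4 * k + i] * win[4 * k + i];
#endif
}
```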

>> + if (incr == 1) {
>
> Does output really need to be interleaved?

It's a known TODO to allow codecs to output some kind of 
SAMPLE_FMT_PLANAR_FLOAT.
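
To illustrate why the incr == 1 case is special, here is a small plain C sketch. It assumes incr is the output stride in samples (1 for mono or planar output, the channel count for interleaved output); store_samples is a made-up name, not a function from the patch. With a stride of 1 the samples are contiguous and can be written in whole blocks (or with full-width SSE stores); with a larger stride each sample has to be scattered individually:

```c
#include <string.h>

/* Hypothetical helper: write n samples from src to out with stride incr. */
static void store_samples(float *out, const float *src, int n, int incr)
{
    if (incr == 1) {
        /* Contiguous output: one block copy, vectorizable. */
        memcpy(out, src, (size_t)n * sizeof(*out));
    } else {
        /* Interleaved output: one scalar store per sample. */
        for (int i = 0; i < n; i++)
            out[i * incr] = src[i];
    }
}
```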

>> + "movups 52(%4), %%xmm0 \n\t"
>> + "shufps $0x1b, %%xmm0, %%xmm0 \n\t"
>> + "movaps (%1), %%xmm1 \n\t"
>
> memory arg

Fixed.

>> + "subps %%xmm1, %%xmm0 \n\t"
>> + "movaps %%xmm0, (%0) \n\t"
>> +
>> + "movups 4(%3), %%xmm0 \n\t"
>> + "movaps 48(%2), %%xmm1 \n\t"
>> + "shufps $0x1b, %%xmm0, %%xmm0 \n\t"
>> + "addps %%xmm1, %%xmm0 \n\t"
>> + "movaps %%xmm0, 112(%0) \n\t"
>
> Why do you alternate between two schedules?

No good reason, fixed.
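
As a sketch of one such step with a single schedule (load, shuffle, arithmetic with a memory argument, store): reverse_sub is a hypothetical name and the operands are simplified compared to the patch. shufps with immediate 0x1b and the same register as source and destination reverses the order of the four floats:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: out[i] = b[3 - i] - a[i], i = 0..3.
 * out and a must be 16-byte aligned; b may be unaligned (movups). */
static void reverse_sub(float *out, const float *a, const float *b)
{
    assert((((uintptr_t)out | (uintptr_t)a) & 15) == 0);
#if defined(__GNUC__) && defined(__SSE__) && \
    (defined(__i386__) || defined(__x86_64__))
    __asm__ volatile(
        "movups (%2), %%xmm0            \n\t" /* unaligned load of b    */
        "shufps $0x1b, %%xmm0, %%xmm0   \n\t" /* reverse the four floats */
        "subps  (%1), %%xmm0            \n\t" /* memory arg, no extra movaps */
        "movaps %%xmm0, (%0)            \n\t"
        :
        : "r"(out), "r"(a), "r"(b)
        : "memory", "xmm0");
#else
    /* Portable fallback with the same result. */
    for (int i = 0; i < 4; i++)
        out[i] = b[3 - i] - a[i];
#endif
}
```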

New patch attached.

-Vitor
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mp3_dspfy5.diff
Type: text/x-patch
Size: 8050 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20100617/2709f79c/attachment.bin>