[FFmpeg-devel] [RFC] snow SSE2 optimizations (was: Re: [FFmpeg-cvslog] r10223 - in trunk/libavcodec/i386: dsputil_mmx.c snowdsp_mmx.c)

Thu Aug 30 13:55:06 CEST 2007

Hello,

And I do have a little asm question:

On Tue, Aug 28, 2007 at 12:07:02AM +0200, Reimar Döffinger wrote:
>         "packuswb %%xmm4, %%xmm0                 \n\t"
>         "movq   %%xmm0, (%%"REG_d")              \n\t"
>         "movhpd %%xmm0, (%%"REG_d",%%"REG_c")    \n\t"

As I understand the documentation, movhpd does nothing float-specific:
it just stores the high 64 bits of the register. But that would mean
that movhps does exactly the same, yet it has a different opcode (one
byte shorter!).
Can someone explain that to me? I guess it makes more sense to just use
movhps? Or should I avoid both and use a second packuswb like the old
code did? Or something completely different?
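
To make the question concrete, here is a minimal sketch using compiler
intrinsics instead of the inline asm above. It assumes an x86 target with
SSE2; the helper names (store_high_pd, store_high_ps, high_stores_match)
are made up for illustration and are not part of the patch.
_mm_storeh_pd corresponds to the movhpd store form (66 0F 17) and
_mm_storeh_pi to the movhps store form (0F 17), although the compiler is
free to pick either encoding. The check only shows that both leave the
same 8 bytes in memory, whatever the bit pattern.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; also pulls in the SSE ones */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store the high 64 bits of v through the
 * double-precision intrinsic (the movhpd store form). */
static void store_high_pd(uint8_t *dst, __m128i v)
{
    double tmp;
    _mm_storeh_pd(&tmp, _mm_castsi128_pd(v));
    memcpy(dst, &tmp, 8);   /* copy the raw bytes, no float conversion */
}

/* Hypothetical helper: the same store through the single-precision
 * intrinsic (the movhps store form). */
static void store_high_ps(uint8_t *dst, __m128i v)
{
    __m64 tmp;
    _mm_storeh_pi(&tmp, _mm_castsi128_ps(v));
    memcpy(dst, &tmp, 8);
}

/* Returns 1 if both stores wrote exactly bytes 8..15 of src. */
static int high_stores_match(const uint8_t src[16])
{
    uint8_t a[8], b[8];
    __m128i v = _mm_loadu_si128((const __m128i *)src);

    store_high_pd(a, v);
    store_high_ps(b, v);
    return memcmp(a, b, 8) == 0 && memcmp(a, src + 8, 8) == 0;
}
```

If this holds for arbitrary input (including bit patterns that would be
NaNs when read as floats), the two instructions are interchangeable for
this store as far as the result goes, and the shorter encoding would be
the obvious pick unless some CPU penalizes mixing float-typed moves with
integer data.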

Greetings,
Reimar Döffinger