[Ffmpeg-devel] [RFC] svq1 very slow encoding

Sat Mar 31 22:34:48 CEST 2007

On Sat, 31 Mar 2007, Trent Piepho wrote:
> On Sat, 31 Mar 2007, Loren Merritt wrote:
>> On Fri, 30 Mar 2007, Trent Piepho wrote:
>>
>>> On x86-64, could "int sum" be put in a 64-bit register?  Which would
>>> generate something like "movd %mm4, %rax".  Don't have a 64-bit system, but
>>> can you use movd with a 64-bit general purpose register?  If you can, isn't
>>> it still wrong, since %rax will have garbage in the top 32 bits?
>>
>> int is 32bit, and the register name generated by a bare %1 is the
>> same size as the value or variable it's associated with.
>
> gcc will use 32-bit registers for 16 or 8 bit variables.

#include <stdint.h>

void foo(void)
{
     uint8_t  a;
     uint16_t b;
     uint32_t c;
     uint64_t d;
     asm volatile(
         "mov $0, %0 \n"
         "mov $1, %1 \n"
         "mov $2, %2 \n"
         "mov $3, %3 \n"
         :"=r"(a), "=r"(b), "=r"(c), "=r"(d)
     );
}

compiles to

0000000000000000 <foo>:
mov    $0x0,%al
mov    $0x1,%si
mov    $0x2,%ecx
mov    $0x3,%rdx
ret

So it looks to me like 8-bit variables get 8-bit registers, 16-bit
variables get 16-bit registers, and so on.
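
If the default size is not the one you want, gcc's x86 operand modifiers
can override it: %b0, %w0, %k0 and %q0 print the 8, 16, 32 and 64-bit name
of whatever register was allocated. A minimal sketch, assuming gcc or clang
on x86-64 with AT&T syntax; the function name zero_extend_low32 is made up
for illustration, and other targets fall back to plain C:

```c
#include <stdint.h>

/* Return the low 32 bits of v, zero-extended to 64 bits.
 * Hypothetical helper, not part of ffmpeg. */
static uint64_t zero_extend_low32(uint64_t v)
{
#if defined(__x86_64__)
    uint64_t out;
    /* Both operands are 64-bit variables, so a bare %0/%1 would print
     * 64-bit names such as %rax. The k modifier forces the 32-bit name
     * (%eax), and a 32-bit write clears bits 63..32 of the register. */
    __asm__("movl %k1, %k0" : "=r"(out) : "r"(v));
    return out;
#else
    /* Assumed fallback for non-x86-64 builds. */
    return (uint32_t)v;
#endif
}
```

Calling zero_extend_low32(0xDEADBEEF12345678) should give 0x12345678.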

>> All 32bit ops zero out the high bits of the destination. And even if they
>> didn't (e.g. if sum was 16bit), gcc will add any necessary extension.
>
> So movd %mm0, %rax will zero the high bits?  Because movd %mm0, %mm1 won't.
> Or do you mean that movd %mm0, %eax will zero the high bits of %rax?

I meant ops involving 32bit general purpose registers.

movd %mm0, %rax  copies the whole 64 bits to rax.
movd %mm0, %eax  copies the low 32 bits and zeros the high 32 bits of rax.
movd %mm0, %mm1  fails to assemble: movd takes an mmx or xmm reg on one
side and a 32-bit or 64-bit gpr (or a memory operand) on the other, never
two mmx regs.
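
The first two cases can be checked with something like the following rough
sketch. Assumptions: gcc or clang on x86-64 with AT&T syntax; it goes
through %xmm0 instead of an mmx register so no emms is needed; the 64-bit
moves are spelled movq; and both function names are made up for
illustration:

```c
#include <stdint.h>

/* Load v into xmm0, then read it back through a 32-bit destination.
 * out starts as all ones so that cleared high bits are visible. */
static uint64_t read_low32_from_xmm(uint64_t v)
{
#if defined(__x86_64__)
    uint64_t out = ~0ull;
    __asm__ volatile(
        "movq %1, %%xmm0  \n\t"   /* 64-bit gpr -> xmm0 */
        "movd %%xmm0, %k0 \n\t"   /* low 32 bits -> 32-bit name of out */
        : "+r"(out)
        : "r"(v)
        : "xmm0");
    return out;
#else
    return (uint32_t)v;           /* assumed fallback elsewhere */
#endif
}

/* Same round trip, but with a 64-bit destination register. */
static uint64_t read_all64_from_xmm(uint64_t v)
{
#if defined(__x86_64__)
    uint64_t out;
    __asm__ volatile(
        "movq %1, %%xmm0 \n\t"    /* 64-bit gpr -> xmm0 */
        "movq %%xmm0, %0 \n\t"    /* all 64 bits -> out */
        : "=r"(out)
        : "r"(v)
        : "xmm0");
    return out;
#else
    return v;                     /* assumed fallback elsewhere */
#endif
}
```

With v = 0x1122334455667788, the first should return 0x55667788 (high half
zeroed) and the second should return v unchanged.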

--Loren Merritt