[Ffmpeg-devel] Fixed vs. Floating Point AAC
Loren Merritt
lorenm
Fri Mar 10 07:08:48 CET 2006
On Thu, 9 Mar 2006, Rich Felker wrote:
> On Thu, Mar 09, 2006 at 04:00:54PM -0800, Loren Merritt wrote:
>> On Thu, 9 Mar 2006, Rich Felker wrote:
>>>
>>> How can 32*32>>32 possibly be slower than 32*32? It's just a matter of
>>> whether you read the result from eax or edx...
>>
>> It's a matter of whether you put the source in eax and read the result
>> from edx (32*32>>32) or use any two registers you want (32*32).
>
> Are you sure? Last I checked the x86 always used eax for the low 32
> bits of a product. There is a special instruction that does not store
> the high 32 bits to edx, but I don't think you get to choose where the
> low 32 bits go. Apologies if I'm mistaken on this..
"imul foo" calculates foo*eax and stores the 64-bit result in edx:eax.
"imul foo,bar" calculates foo*bar and stores the low 32 bits in foo.
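
For reference, here is a minimal C sketch of the two operations being compared. The names mul_hi and mul_lo are made up for illustration and do not come from the thread or from FFmpeg. Which imul form a compiler emits depends on the compiler and target; on 32-bit x86 the high-half version generally needs the one-operand form, so it is tied to eax and edx, while the low-half version can use the two-operand form with any registers.

```c
#include <stdint.h>

/* 32*32>>32: keep the high 32 bits of the signed 64-bit product.
 * Assumes >> on a negative int64_t is an arithmetic shift, which is
 * implementation-defined in C but true of mainstream compilers. */
static int32_t mul_hi(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * b) >> 32);
}

/* 32*32: keep only the low 32 bits of the product.
 * The multiply is done in unsigned arithmetic so that wraparound is
 * well defined; the conversion back to int32_t is assumed to wrap
 * (two's complement), as it does on mainstream compilers. */
static int32_t mul_lo(int32_t a, int32_t b)
{
    return (int32_t)((uint32_t)a * (uint32_t)b);
}
```

With a = 0x40000000 and b = 4 the full product is 2^32, so mul_hi returns 1 and mul_lo returns 0.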
>> In the latency test, the extra 2x mov are expensive.
>
> This is a gcc bug -- not choosing good registers. Try the comparison
> with asm next time..
Not a bug. The benchmark code is a sequence of multiplies, and nothing
else. You can't allocate all the variables to eax. Yes, in real code,
register allocation may help.
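
The actual benchmark is not shown in this message. As a rough sketch of what "a sequence of multiplies, and nothing else" could look like for a latency test, assume a loop in which each multiply consumes the previous result. The names chain_hi and chain_lo are hypothetical. In the high-half chain on 32-bit x86, the one-operand imul fixes eax as an input and overwrites both eax and edx, so a run of such multiplies tends to need extra register moves between steps.

```c
#include <stdint.h>

/* Hypothetical latency-style chain for 32*32>>32: every iteration
 * depends on the result of the previous one. Assumes an arithmetic
 * right shift for negative values, as on mainstream compilers. */
static int32_t chain_hi(int32_t x, int32_t k, int n)
{
    for (int i = 0; i < n; i++)
        x = (int32_t)(((int64_t)x * k) >> 32);
    return x;
}

/* The same chain keeping only the low 32 bits (32*32).
 * Unsigned arithmetic is used so that overflow wraps. */
static int32_t chain_lo(int32_t x, int32_t k, int n)
{
    for (int i = 0; i < n; i++)
        x = (int32_t)((uint32_t)x * (uint32_t)k);
    return x;
}
```

To time either loop, n would need to be large and the result would have to be used afterwards so that the compiler cannot discard the chain.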
>> And in the throughput
>> test, the extra reg caused spillage too.
>
> This is an actual issue.
--Loren Merritt