[FFmpeg-devel] [PATCH] ff_scalarproduct_float_sse

Wed Jan 20 21:43:25 CET 2010

On Jan 20, 2010, at 1:39 PM, Reimar Döffinger wrote:

> On Wed, Jan 20, 2010 at 08:59:44AM -0800, Jason Garrett-Glaser wrote:
>> 2010/1/20 Måns Rullgård <mans at mansr.com>:
>>> Michael Niedermayer <michaelni at gmx.at> writes:
>>> 
>>>> On Tue, Jan 19, 2010 at 11:42:40PM -0500, Alex Converse wrote:
>>>>> This causes a >50% decrease in SBR decode time.
>>>>> 
>>>>> For the time being it can help in the other places where
>>>>> scalarproduct_float() is used.
>>>>> 
>>>>> Regards,
>>>>> Alex Converse
>>>> 
>>>>>  dsputil_mmx.c    |    5 +++++
>>>>>  dsputil_yasm.asm |   25 +++++++++++++++++++++++++
>>>> 
>>>> Would you mind avoiding yasm and using gcc asm instead?
>>>> 
>>>> I have no problem with yasm as such but gcc asm is more portable and
>>>> can be integrated with C code if we ever want that.
>>> 
>>> I have to disagree.  Just look at how many FATE targets broke with
>>> your change to h264_loop_filter_strength_mmx2 yesterday.  Several
>>> compilers are still failing to build it.
>>> 
>>> I'm not aware of any serious OS on which yasm doesn't run, so the
>>> portability argument doesn't hold water.
>> 
>> Furthermore, do note that all current non-yasm SSE assembly in ffmpeg
>> is broken on Win64...
> 
> How and why? I may just not have run into it (not tested much), but my Win64
> MPlayer build seemed to work fine.

The Win64 ABI treats several xmm registers (xmm6 through xmm15) as callee-saved, and none of the inline asm marks them as clobbered, so the compiler never saves and restores them around the asm.
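
To make that concrete, here is a rough sketch of what a correctly annotated gcc inline asm version might look like. This is not the code from the patch or anything in the tree: the function name is made up, it assumes len is a positive multiple of 4, and it deliberately uses xmm6-xmm8 only to show the clobber list. Listing the registers as clobbers is what lets the compiler save and restore them when targeting Win64.

```c
#include <stddef.h>

/* Hypothetical helper for illustration only; not FFmpeg's implementation.
 * Assumes len is a positive multiple of 4. Uses unaligned loads, so the
 * buffers do not need to be 16-byte aligned. */
static float scalarproduct_float_sketch(const float *a, const float *b, int len)
{
    if (len <= 0)
        return 0.0f;
#if defined(__GNUC__) && defined(__x86_64__)
    float sum;
    const float *a_end = a + len;
    const float *b_end = b + len;
    /* Negative byte offset that counts up to zero. */
    ptrdiff_t i = -(ptrdiff_t)len * (ptrdiff_t)sizeof(float);

    __asm__ volatile (
        "xorps   %%xmm6, %%xmm6          \n\t" /* accumulator = 0 */
        "1:                              \n\t"
        "movups  (%[a],%[i]), %%xmm7     \n\t" /* load 4 floats of a */
        "movups  (%[b],%[i]), %%xmm8     \n\t" /* load 4 floats of b */
        "mulps   %%xmm8, %%xmm7          \n\t"
        "addps   %%xmm7, %%xmm6          \n\t"
        "add     $16, %[i]               \n\t"
        "jl      1b                      \n\t"
        /* Horizontal add of the 4 lanes in xmm6. */
        "movhlps %%xmm6, %%xmm7          \n\t"
        "addps   %%xmm7, %%xmm6          \n\t"
        "movaps  %%xmm6, %%xmm7          \n\t"
        "shufps  $1, %%xmm7, %%xmm7      \n\t"
        "addss   %%xmm7, %%xmm6          \n\t"
        "movss   %%xmm6, %[sum]          \n\t"
        : [sum] "=m" (sum), [i] "+r" (i)
        : [a] "r" (a_end), [b] "r" (b_end)
        /* The part that matters here: every xmm register the asm touches
         * is declared, including the Win64 callee-saved ones. */
        : "xmm6", "xmm7", "xmm8", "cc", "memory"
    );
    return sum;
#else
    /* Plain C fallback so the sketch still builds on other targets. */
    float sum = 0.0f;
    for (int k = 0; k < len; k++)
        sum += a[k] * b[k];
    return sum;
#endif
}
```

Without the "xmm6", "xmm7", "xmm8" entries the asm would still assemble and usually appear to work, which is probably why this goes unnoticed: it only breaks when the caller happens to keep a live value in one of those registers.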