[FFmpeg-devel] [PATCH] some SIMD write-combining for h264

Alexander Strange astrange
Mon Jan 25 03:06:59 CET 2010


On Jan 24, 2010, at 2:47 PM, Michael Niedermayer wrote:

> On Sun, Jan 24, 2010 at 12:58:47AM -0500, Alexander Strange wrote:
>> 
>> On Jan 18, 2010, at 5:34 AM, Alexander Strange wrote:
>> 
>>> 
>>> On Jan 17, 2010, at 8:27 PM, Måns Rullgård wrote:
>>> 
>>>> Alexander Strange <astrange at ithinksw.com> writes:
>>>> 
>>>>> On Sun, Jan 17, 2010 at 7:54 PM, Carl Eugen Hoyos <cehoyos at ag.or.at> wrote:
>>>>>> Alexander Strange <astrange <at> ithinksw.com> writes:
>>>>>> 
>>>>>>>>>> also what sets __MMX__ ? we have our own defines for that
>>>>>>>>> 
>>>>>>>>> It's a gcc builtin define, set based on ./configure --cpu=x adding
>>>>>>>>> -march.  HAVE_MMX is for the build and not the host cpu family, and
>>>>>>>>> this is inlined asm, so it can't use it.
>>>>>>>> 
>>>>>>>> Huh?  Host... build???
>>>>>>> 
>>>>>>> Oh, that was supposed to be "target"...
>>>>>>> Anyway, this is MMX being used like the cmov/clz inlines, so it depends on the
>>>>>>> given --cpu and not on the build system's capabilities.
>>>>>> 
>>>>>> Could you explain once more why this shouldn't be HAVE_MMX?
>>>>>> If the user passes --disable-mmx to configure, he imo expects MMX to be disabled.
>>>>>> 
>>>>>> Carl Eugen
>>>>> 
>>>>> HAVE_MMX isn't enough to enable it - './configure --cpu=i586' enables
>>>>> HAVE_MMX, but i586 doesn't have it.
>>>> 
>>>> Not anymore.
>>> 
>>> I think that was wrong. --cpu is the minimum required cpu, not the only expected cpu, but that turned off building dsputil mmx which is runtime-cpudetected. (even if runtime-cpudetection is disabled, actually)
>>> Besides, './configure' also sets HAVE_MMX but targets generic x86-32 which might not have it.
>>> 
>>>>> Technically I'd say --disable-mmx should pass -mno-mmx to gcc, but
>>>>> that seems like a complicated change to configure, so I'll check
>>>>> HAVE_MMX to disable it as well.
>>>> 
>>>> That's almost trivial to arrange.  Do we want that?
>>> 
>>> The other new architectures would have to be added as configure options (sse[234] are missing), but if you want then sure.
>>> 
>>> Applied the first patch with #if HAVE_MMX around the file and av_always_inline.
>>> The second patch needs to be rewritten now since the decoder changed under it.
>> 
>> Here it is, with some more opportunities I found.
>> The original file now goes from 12.236s to 11.862s on x86-32 (3% faster).
> 
> I am fine with the patch once the types are correct to avoid warnings

Applied with proper & and then fixed xchg_mb_border.

I'd better run the full h264 regression more often; the file I thought was MBAFF apparently wasn't MBAFF enough...
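
For anyone reading the quoted discussion later, the guard it settles on (HAVE_MMX from configure plus gcc's builtin __MMX__ from -march) could be sketched roughly as below. This is only an illustration, not the applied patch: copy64 is a hypothetical name, and the HAVE_MMX default of 0 stands in for whatever configure would write to config.h.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: stand-in for the value configure would write to config.h. */
#ifndef HAVE_MMX
#define HAVE_MMX 0
#endif

/* Hypothetical helper: copy 8 bytes with one 64-bit access
 * instead of two 32-bit ones. */
static inline void copy64(void *dst, const void *src)
{
    assert(dst && src);
#if HAVE_MMX && defined(__MMX__)
    /* The build allows MMX (HAVE_MMX) and gcc is targeting a CPU that
     * has it (__MMX__), so inline asm is safe without runtime detection.
     * As with any MMX code, emms is needed before later x87 float use. */
    __asm__("movq %1, %%mm0 \n\t"
            "movq %%mm0, %0 \n\t"
            : "=m"(*(uint64_t *)dst)
            : "m"(*(const uint64_t *)src)
            : "mm0");
#else
    /* Portable fallback: same result without assuming MMX. */
    uint64_t tmp;
    memcpy(&tmp, src, sizeof(tmp));
    memcpy(dst, &tmp, sizeof(tmp));
#endif
}
```

With --disable-mmx (HAVE_MMX 0) or a --cpu that does not imply -mmmx, only the fallback branch is compiled, which is the behaviour Carl Eugen asked for.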


