[FFmpeg-devel] [PATCH] Add check for Athlon64 and similar AMD processors with slow SSE2.

Jason Garrett-Glaser jason
Sun Feb 6 06:19:39 CET 2011


On Sat, Feb 5, 2011 at 7:23 PM, Ronald S. Bultje <rsbultje at gmail.com> wrote:
> Hi,
>
> On Sat, Feb 5, 2011 at 10:04 PM, Jason Garrett-Glaser <jason at x264.com> wrote:
>> On Sat, Feb 5, 2011 at 5:46 PM, Ronald S. Bultje <rsbultje at gmail.com> wrote:
>>> Hi,
>>>
>>> On Fri, Feb 4, 2011 at 1:03 PM, Justin Ruggles <justin.ruggles at gmail.com> wrote:
>>>> On 02/04/2011 12:27 PM, Ronald S. Bultje wrote:
>>>>> I'm not against the original idea of reusing SSE2SLOW, just make sure
>>>>> it's properly documented.
>>>>> - SSE2 - CPU supports good SSE2
>>>>> - SSE2SLOW (core1 etc.) - CPU supports SSE2 in theory but it's almost
>>>>> always slower - only set SSE2 functions if explicitly tested to be
>>>>> faster
>>>>> - SSE2|SSE2SLOW (athlon64 etc.) - CPU supports SSE2 but it's
>>>>> occasionally slower - don't set SSE2 functions if explicitly tested to
>>>>> be slower
>>>>>
>>>>> And I thought that's what your patch did.
>>>>
>>>>
>>>> It did. But I think it made one of the flag checks more complicated.
>>>>
>>>> all sse2:
>>>> flags & (SSE2 | SSE2SLOW)
>>>>
>>>> exclude core 1 only:
>>>> flags & SSE2
>>>>
>>>> exclude core 1 and athlon64:
>>>> (flags & SSE2) && !(flags & SSE2SLOW)
>>>> or
>>>> (flags & (SSE2 | SSE2SLOW)) ^ SSE2SLOW
>>>
>>> (flags & (SSE2|SSE2SLOW)) == SSE2,
>>>
>>> (^ SSE2SLOW only flips the slow bit, and then if either bit is non-zero, etc.)
>>>
>>>> exclude athlon64 only:
>>>> (flags & (SSE2 | SSE2SLOW)) && !(flags & SSE2 && flags & SSE2SLOW)
>>>> or
>>>> (flags & (SSE2 | SSE2SLOW)) ^ (SSE2 | SSE2SLOW)
>>>>
>>>> The first 3 are self-explanatory, but the last case is not.
>>>
>>> I don't think it matters. When would you ever want to exclude
>>> Athlon64, but not Core1?
>>
>> Almost any SSE2 function?
>
> Isn't that the other way around?

Oh, I misread what you wrote.  Correct.

Jason
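
For reference, the four checks discussed in the quoted text can be sketched in C as below. This is only an illustration of the flag encoding proposed in the thread, not code from the patch: the enum values and the helper function names are hypothetical, and the values are not FFmpeg's actual AV_CPU_FLAG_* constants.

```c
#include <stdbool.h>

/* Hypothetical flag values for illustration only; these are not
 * FFmpeg's actual AV_CPU_FLAG_* constants. */
enum {
    SSE2     = 1 << 0,
    SSE2SLOW = 1 << 1
};

/* Encoding described in the thread:
 *   SSE2             CPU supports good SSE2
 *   SSE2SLOW         Core 1 etc.: SSE2 is almost always slower
 *   SSE2 | SSE2SLOW  Athlon64 etc.: SSE2 is occasionally slower
 */

/* All SSE2-capable CPUs, fast or slow. */
static inline bool any_sse2(int flags)
{
    return (flags & (SSE2 | SSE2SLOW)) != 0;
}

/* Exclude Core 1 only. */
static inline bool sse2_except_core1(int flags)
{
    return (flags & SSE2) != 0;
}

/* Exclude Core 1 and Athlon64. The parentheses around the masked value
 * are required because == binds tighter than & in C. */
static inline bool sse2_fast_only(int flags)
{
    return (flags & (SSE2 | SSE2SLOW)) == SSE2;
}

/* Exclude Athlon64 only: some SSE2 bit is set, but not both. */
static inline bool sse2_except_athlon64(int flags)
{
    int masked = flags & (SSE2 | SSE2SLOW);
    return masked != 0 && masked != (SSE2 | SSE2SLOW);
}
```

Note that the two XOR variants quoted above are not equivalent to these helpers when used as truth values: both evaluate to non-zero for flags == 0, that is, on a CPU with no SSE2 at all.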


