[FFmpeg-devel] [PATCH] Add check for Athlon64 and similar AMD processors with slow SSE2.
Thu Feb 3 00:47:30 CET 2011
On 02/02/2011 06:40 PM, Måns Rullgård wrote:
> Justin Ruggles <justin.ruggles at gmail.com> writes:
>> This was ported from x264 so we need permission to relicense from
>> Loren, Jason, or whoever added this particular check in x264.
> I doubt there's much copyright on checking a single bit, but it's
> always nice to ask first.
>> Once this is added, I can go through libavcodec/x86 and test various
>> SSE2 functions on my Athlon64 to check if they're faster with other
>> versions. We also have a couple functions now that are hackishly
>> handled by disabling SSE2 for all AMD processors by checking the
>> 3DNow flag. Those are the ones I'll test first.
>> libavutil/x86/cpu.c | 7 +++++++
>> 1 files changed, 7 insertions(+), 0 deletions(-)
>> diff --git a/libavutil/x86/cpu.c b/libavutil/x86/cpu.c
>> index 4b6cb0d..7e847c1 100644
>> --- a/libavutil/x86/cpu.c
>> +++ b/libavutil/x86/cpu.c
>> @@ -109,6 +109,13 @@ int ff_get_cpu_flags_x86(void)
>> rval |= AV_CPU_FLAG_MMX;
>> if (ext_caps & (1<<22))
>> rval |= AV_CPU_FLAG_MMX2;
>> + /* Allow for selectively disabling SSE2 functions on AMD processors
>> + with SSE2 support but not SSE4a. This includes Athlon64,
>> + some Opteron, and some Sempron processors. MMX, SSE, or 3DNow! are
>> + faster than SSE2 often enough to utilize this special-case flag. */
>> + if (rval & AV_CPU_FLAG_SSE2 && !(ecx & 0x00000040))
>> + rval |= AV_CPU_FLAG_SSE2SLOW;
>> if (!strncmp(vendor.c, "GenuineIntel", 12) &&
> Why does this code only affect AMD processors? I guess it has
> something to do with those magic bit checks before it...
That is a guess on my part. I don't know much about cpuid. I assume we
wouldn't check for 3DNow on a non-AMD processor, but that may be an
incorrect assumption. x264 also checks the AMD vendor string in their
equivalent of this whole section, but FFmpeg does not.
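
For illustration, here is a minimal sketch of what an explicit vendor guard
around the same bit test could look like. It is not the patch itself and not
x264's code. The EX_* constants and the ex_mark_slow_sse2() helper are
hypothetical names with made-up flag values. The sketch assumes the caller
has already read the 12-byte CPUID vendor string and ECX from extended leaf
0x80000001, where bit 6 (the 0x40 mask in the patch) reports SSE4a.

```c
#include <string.h>

/* Hypothetical flag values for illustration only; these are not
 * FFmpeg's actual AV_CPU_FLAG_* constants. */
#define EX_FLAG_SSE2      0x0010
#define EX_FLAG_SSE2SLOW  0x0100

/* Bit 6 of ECX from CPUID leaf 0x80000001 reports SSE4a
 * (same bit as the 0x00000040 mask used in the patch). */
#define EX_CPUID_EXT_ECX_SSE4A (1u << 6)

/*
 * flags:   CPU flags detected so far
 * vendor:  12-byte CPUID vendor string (need not be NUL-terminated)
 * ext_ecx: ECX as returned by CPUID leaf 0x80000001
 *
 * Returns flags, with EX_FLAG_SSE2SLOW added only for AMD processors
 * that report SSE2 but not SSE4a.
 */
static int ex_mark_slow_sse2(int flags, const char *vendor, unsigned ext_ecx)
{
    /* Leave the flags untouched on anything that is not AMD. */
    if (strncmp(vendor, "AuthenticAMD", 12) != 0)
        return flags;

    if ((flags & EX_FLAG_SSE2) && !(ext_ecx & EX_CPUID_EXT_ECX_SSE4A))
        flags |= EX_FLAG_SSE2SLOW;

    return flags;
}
```

With a guard like this, a non-AMD processor that has SSE2 but leaves the
SSE4a bit clear would not be marked as slow, which is the case the question
above is about.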