[FFmpeg-devel] [RFC] clobbers for XMM registers
Thu Sep 30 18:59:54 CEST 2010
Alexander Strange <astrange at ithinksw.com> writes:
> On Thursday, September 30, 2010, Måns Rullgård <mans at mansr.com> wrote:
>> "Ronald S. Bultje" <rsbultje at gmail.com> writes:
>>> 2010/9/28 Måns Rullgård <mans at mansr.com>:
>>>> Michael Niedermayer <michaelni at gmx.at> writes:
>>>>> On Tue, Sep 28, 2010 at 09:36:40AM -0400, Ronald S. Bultje wrote:
>>>>>> On Tue, Sep 28, 2010 at 8:34 AM, Michael Niedermayer <michaelni at gmx.at> wrote:
>>>>>> > you want to execute code from vp3dsp_sse2.c on a pre SSE cpu?
>>>>>> All _sse2 files are template files that are included in dsputil_mmx.c
>>>>>> or similar.
>>>>> we could add the flags to dsputil_mmx then
>>>> That would allow the compiler to use SSE instructions in functions
>>>> that should be MMX only.
>>> I'm gonna start kicking this subject until it's solved. Come on guys,
>>> keep this moving. Why don't we make it (the clobbering) a macro and
>>> only enable this on x86-64. Don't forget all xmm registers are
>>> caller-save on x86-32 and x86-64 has no issues with marking clobbers
>> The issue is not fundamentally about caller vs callee saved
>> registers. It is about telling the compiler which registers are
>> clobbered, so that it can save and restore them if necessary.
>> The missing clobber lists caused the FFT to fail with suncc, despite
>> all the used registers being caller-saved. Apparently the compiler
>> was using them for something outside the asm block.
>>> (and even if it did, -msse is fine, there is no single x86-64 CPU that
>>> does not support SSE). We could consider making it as simple as :::
>>> CLOBBER_IF_X86_64("%xmm6", "%xmm7",) "%eax" which evaluates to the
>>> string in it (including commas) on x86-64 and nothing on x86-32 (and
>>> omit the comma if that's the only thing in the clobberlist).
>> We obviously need a conditional of some kind, but it should be tested
>> in configure and applied whenever the compiler recognises xmm registers.
>> It is, however, not quite as straightforward as you make it out.
>> Stray commas are not allowed, nor is an empty list.
>> One possible solution is to have the macro always include "cc". Most
>> of the asm blocks do clobber the condition flags, and for any that do
>> not, it is unlikely to make any difference. It also seems that
>> including the stack pointer in the clobber list is ignored, although
>> relying on this seems dubious at best.
> asm blocks always clobber cc whether or not you put it in the list, so
> the "cc" clobber is a no-op.
In that case always adding it is certainly harmless, and allows a
single macro to be used. Where is this documented BTW? I couldn't
find anything on the specifics of "cc" clobbers on x86.
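
A minimal sketch of what such a single macro could look like, using
GCC-style extended asm. The names XMM_CLOBBERS and add4_sketch are
made up for illustration, and HAVE_XMM_CLOBBERS stands in for a
configure test that does not exist yet; approximating it with
__SSE__ is only an assumption of this sketch. Because the expansion
always ends in "cc", it is never empty and never leaves a stray
comma, whether or not the xmm registers are listed.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a configure check (hypothetical name): 1 when the
 * compiler accepts xmm registers in clobber lists. Approximating it
 * with __SSE__ is an assumption of this sketch. */
#ifndef HAVE_XMM_CLOBBERS
#  ifdef __SSE__
#    define HAVE_XMM_CLOBBERS 1
#  else
#    define HAVE_XMM_CLOBBERS 0
#  endif
#endif

/* Always ends in "cc", so the expansion is never empty and never
 * leaves a stray comma, with or without the xmm registers. */
#if HAVE_XMM_CLOBBERS
#  define XMM_CLOBBERS(...) __VA_ARGS__, "cc"
#else
#  define XMM_CLOBBERS(...) "cc"
#endif

/* dst[i] = a[i] + b[i] for i in 0..3, using xmm6/xmm7 on x86. */
static void add4_sketch(float *dst, const float *a, const float *b)
{
    assert(dst != NULL && a != NULL && b != NULL);
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __asm__ volatile(
        "movups (%1), %%xmm6   \n\t"
        "movups (%2), %%xmm7   \n\t"
        "addps  %%xmm7, %%xmm6 \n\t"
        "movups %%xmm6, (%0)   \n\t"
        :
        : "r"(dst), "r"(a), "r"(b)
        : XMM_CLOBBERS("%xmm6", "%xmm7"), "memory");
#else
    /* Plain C fallback so the sketch builds on other targets. */
    for (int i = 0; i < 4; i++)
        dst[i] = a[i] + b[i];
#endif
}
```

Further clobbers can follow the macro after a comma, as "memory"
does here, so no second macro variant is needed for blocks that
clobber nothing but xmm registers.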
Måns Rullgård
mans at mansr.com