[FFmpeg-devel] [RFC] clobbers for XMM registers
Thu Sep 30 17:28:14 CEST 2010
"Ronald S. Bultje" <rsbultje at gmail.com> writes:
> 2010/9/28 Måns Rullgård <mans at mansr.com>:
>> Michael Niedermayer <michaelni at gmx.at> writes:
>>> On Tue, Sep 28, 2010 at 09:36:40AM -0400, Ronald S. Bultje wrote:
>>>> On Tue, Sep 28, 2010 at 8:34 AM, Michael Niedermayer <michaelni at gmx.at> wrote:
>>>> > you want to execute code from vp3dsp_sse2.c on a pre SSE cpu?
>>>> All _sse2 files are template files that are included in dsputil_mmx.c
>>>> or similar.
>>> we could add the flags to dsputil_mmx then
>> That would allow the compiler to use SSE instructions in functions
>> that should be MMX only.
> I'm gonna start kicking this subject until it's solved. Come on guys,
> keep this moving. Why don't we make it (the clobbering) a macro and
> only enable this on x86-64. Don't forget all xmm registers are
> caller-save on x86-32 and x86-64 has no issues with marking clobbers
The issue is not fundamentally about caller vs callee saved
registers. It is about telling the compiler which registers are
clobbered, so that it can save and restore them if necessary.
The missing clobber lists caused the FFT to fail with suncc, despite
all the used registers being caller-saved. Apparently the compiler
was using them for something outside the asm block.
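
To illustrate what a clobber list tells the compiler, here is a minimal
sketch. The helper name zero16 is made up for this example and is not
FFmpeg code, and the preprocessor guard is only an assumption to keep the
snippet buildable on non-x86 targets. The asm block overwrites xmm6, so
xmm6 is named in the clobber list and the compiler knows not to keep a
live value there across the block.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from FFmpeg): zero 16 bytes at dst. */
static void zero16(uint8_t *dst)
{
#if defined(__GNUC__) && defined(__SSE2__)
    __asm__ volatile(
        "pxor   %%xmm6, %%xmm6  \n\t"   /* xmm6 = 0 */
        "movdqu %%xmm6, (%0)    \n\t"   /* unaligned 16-byte store to dst */
        :                               /* no outputs */
        : "r"(dst)                      /* input: destination pointer */
        : "%xmm6", "memory");           /* xmm6 and memory are modified */
#else
    /* Assumed fallback so the sketch also builds without SSE2. */
    memset(dst, 0, 16);
#endif
}
```

Without "%xmm6" in the list, the compiler is free to assume the register
still holds whatever it put there before the asm block.
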
> (and even if it did, -msse is fine, there is no single x86-64 CPU that
> does not support SSE). We could consider making it as simple as :::
> CLOBBER_IF_X86_64("%xmm6", "%xmm7",) "%eax" which evaluates to the
> string in it (including commas) on x86-64 and nothing on x86-32 (and
> omit the comma if that's the only thing in the clobberlist).
We obviously need a conditional of some kind, but it should be tested
in configure and applied whenever the compiler recognises xmm registers.
It is, however, not quite as straightforward as you make it out.
Stray commas are not allowed, nor is an empty list.
One possible solution is to have the macro always include "cc". Most
of the asm blocks do clobber the condition flags, and for any that do
not, it is unlikely to make any difference. It also seems that
including the stack pointer in the clobber list is ignored, although
relying on this seems dubious at best.
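
A rough sketch of the "always include cc" idea follows. The names
HAVE_XMM_CLOBBERS, XMM_CLOBBERS and add4f are placeholders invented for
this example. In a real build the HAVE_ flag would come from a configure
test, whereas here it is guessed from __SSE__ only so that the snippet
stands alone. Since the macro never expands to nothing and never ends in
a comma, it can be the only entry in a clobber list or be followed by
further entries.

```c
/* Placeholder for a configure result: does the compiler accept
 * xmm register names in clobber lists? Guessed from __SSE__ here. */
#ifndef HAVE_XMM_CLOBBERS
#  if defined(__GNUC__) && defined(__SSE__)
#    define HAVE_XMM_CLOBBERS 1
#  else
#    define HAVE_XMM_CLOBBERS 0
#  endif
#endif

#if HAVE_XMM_CLOBBERS
/* Expands to the given registers followed by "cc". */
#  define XMM_CLOBBERS(...) __VA_ARGS__, "cc"
#else
/* Expands to "cc" alone: never empty, no stray comma. */
#  define XMM_CLOBBERS(...) "cc"
#endif

/* Hypothetical example function: dst[i] = a[i] + b[i] for 4 floats. */
static void add4f(float *dst, const float *a, const float *b)
{
#if HAVE_XMM_CLOBBERS
    __asm__ volatile(
        "movups (%1), %%xmm6    \n\t"   /* load a[0..3] */
        "movups (%2), %%xmm7    \n\t"   /* load b[0..3] */
        "addps  %%xmm7, %%xmm6  \n\t"   /* xmm6 += xmm7 */
        "movups %%xmm6, (%0)    \n\t"   /* store to dst[0..3] */
        :
        : "r"(dst), "r"(a), "r"(b)
        : XMM_CLOBBERS("%xmm6", "%xmm7"), "memory");
#else
    /* Assumed C fallback so the sketch builds on any target. */
    for (int i = 0; i < 4; i++)
        dst[i] = a[i] + b[i];
#endif
}
```

With the flag set, the clobber list above becomes "%xmm6", "%xmm7", "cc",
"memory". With the flag clear, the same macro call would yield "cc",
"memory", which is still a valid list.
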
mans at mansr.com