[FFmpeg-devel] [RFC] clobbers for XMM registers

Ramiro Polla ramiro.polla
Mon Oct 4 04:40:35 CEST 2010


On Sun, Oct 3, 2010 at 4:43 PM, Ronald S. Bultje <rsbultje at gmail.com> wrote:
> On Sun, Oct 3, 2010 at 3:20 PM, Ramiro Polla <ramiro.polla at gmail.com> wrote:
>> On Thu, Sep 30, 2010 at 10:50 PM, Michael Niedermayer <michaelni at gmx.at> wrote:
>>> On Thu, Sep 30, 2010 at 08:49:04PM -0400, Alexander Strange wrote:
>>>> On Sep 30, 2010, at 8:31 PM, Michael Niedermayer wrote:
>>>> > On Fri, Oct 01, 2010 at 12:42:33AM +0100, M?ns Rullg?rd wrote:
>>>> >> Michael Niedermayer <michaelni at gmx.at> writes:
>>>> >>> On Thu, Sep 30, 2010 at 04:02:00PM -0300, Ramiro Polla wrote:
>>>> >>>> 2010/9/30 M?ns Rullg?rd <mans at mansr.com>:
>>>> >>>>> Ramiro Polla <ramiro.polla at gmail.com> writes:
>>>> >>>>>> 2010/9/30 M?ns Rullg?rd <mans at mansr.com>:
>>>> >>>>>>> Ramiro Polla <ramiro.polla at gmail.com> writes:
>>>> >>>>>>>> What about
>>>> >>>>>>>> #if HAVE_XMM_CLOBBERS
>>>> >>>>>>>> # ? ?define XMM_CLOBBERS(a, ...) __VA_ARGS__
>>>> >>>>>>>> #else
>>>> >>>>>>>> # ? ?define XMM_CLOBBERS(a, ...) a
>>>> >>>>>>>> #endif
>>>> >>>>>>>>
>>>> >>>>>>>> to be used as in lavc/x86/fft_sse.c:
>>>> >>>>>>>> ? ? ? ? :"+r"(j), "+r"(k)
>>>> >>>>>>>> ? ? ? ? :"r"(output+n4), "r"(output+n4*3),
>>>> >>>>>>>> ? ? ? ? ?"m"(*m1m1m1m1)
>>>> >>>>>>>> ? ? ? ? XMM_CLOBBERS(, : "%xmm0", "%xmm1", "%xmm7")
>>>> >>>>>>>> ? ? );
>>>> >>>>>>>
>>>> >>>>>>> That falls over if any other clobbers are needed.
>>>> >>>>>>
>>>> >>>>>> If any other clobbers are needed they could be written before the macro.
>>>> >>>>>
>>>> >>>>> That won't work.
>>>> >>>>
>>>> >>>> : "eax" XMM_CLOBBERS(,, "%xmm0", "%xmm1", "%xmm7")
>>>> >>>>
>>>> >>>> But I agree having "cc" as dummy is simpler.
>>>> >>>
>>>> >>> it may be simpler but your variant is more correct
>>>> >>
>>>> >> If "cc" is a no-op neither variant is more correct. ?Mine is a hell of
>>>> >> a lot prettier though.
>>>> >
>>>> > unless "cc" is documented to be a no-op on x86 we should not assume it is
>>>>
>>>> svn blame on gcc says it hasn't changed since x86 was rewritten in 1999.
>>>> (the code in question being ix86_md_asm_clobbers(), in a file too big to display in viewcvs)
>>>>
>>>> I hardly see how it could change now since it would break all users of asm.
>>>
>>> i agree with you still i dont think the risk with cc byting us is 0 and theres
>>> a perfectly working solution, namely ramiros macro.
>>> and it wont cost us anything to use ramiros macro now. Changing it later will
>>> cost us man hours.
>>
>> So to keep this thread going, 3 options have been suggested:
>> 1- using "cc" as a dummy clobber
>> 2- using 2 macros
>> 3- using 1 macro that has both clobber lists (with and without xmm_support)
>>
>> 3 will inevitably lead someone to write XMM_CLOBBERS("xmm6", "xmm7")
>> without the first leading comma which is not intuitive so I consider
>> it the worst option. 2 seems nice to me but apparently just me. 1 is
>> more elegant at the cost of depending on "cc" being a no-op on i386 (I
>> couldn't find any occurrence in gcc's source code where it did matter
>> though).
>>
>> Ronald, what's your choice?
>
> 2 is fine with me also. It's not prettiest, but (imo) our asm isn't
> supposed to be pretty, it's supposed to be rock-solid (and fast). If
> you want something pretty, you use C or some fancy
> scripting-language-of-the-day. Can you re-post your patch plus add a
> few lines of doxy that include an example (yes, I want the docs to
> include an example) of which to use when and how, plus your patch that
> makes this used in fft? I'll start working on fixing the remaining
> dsputil/enc functions then (that'll take some time since I need to
> merge various asm blocks).

Attached. I'm not happy with the documentation though, feel free to improve it.

> Let's fix/nish this for good. At least then I can shout at anyone
> committing inline asm without proper clobber marking in the future
> (e.g. that vc1 patch from two days ago).

Something else seems to have broken more win64 stuff. That
win64-specific fate configuration started failing more tests between
25296 and 25300.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: xmm_clobbers_1.diff
Type: application/octet-stream
Size: 1670 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20101003/8b7b22f6/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fft_sse_xmm_clobbers.diff
Type: application/octet-stream
Size: 370 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20101003/8b7b22f6/attachment-0001.obj>



More information about the ffmpeg-devel mailing list